David Helkowski edited this page Dec 10, 2020 · 7 revisions

Related Projects

  • CLI Tools to control iOS devices
      • via Apple Private APIs
      • via reverse engineered USB protocol
          • libimobiledevice
  • IOS Video related links
  • Lockdown / DTX Links

Components

The stf_ios_support repo is composed of a few different things:

  • coordinator
  • Makefile to build everything
  • config.json
  • video_enabler

Coordinator

The coordinator is a multithreaded Golang program designed to coordinate all of the different running processes needed to provide an IOS device to a running STF server. The coordinator starts the following processes immediately when it is launched:

  • stf_provider
  • device_detector
  • video_enabler

The coordinator runs the following set of processes when an IOS device is plugged in via USB:

  • stf_device_ios unit
  • wda_proxy ( which in turn runs WebDriverAgent )
  • ffmpeg
  • mirror_feed
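The coordinator's supervision of these child processes can be sketched in Go. This is a minimal illustration, not the actual coordinator code: the process names come from the lists above, but the invocations, restart policy, and error handling are assumptions.

```go
package main

import (
	"log"
	"os/exec"
	"sync"
)

// runProcess starts one child process and waits for it to exit.
// A real coordinator would also restart children that die; that
// loop is omitted here to keep the sketch short.
func runProcess(name string, args ...string) error {
	cmd := exec.Command(name, args...)
	log.Printf("starting %s", name)
	err := cmd.Run()
	if err != nil {
		log.Printf("%s exited with error: %v", name, err)
	}
	return err
}

func main() {
	// Processes started immediately ( illustrative names taken from
	// the list above; the real invocations differ ).
	startup := []string{"stf_provider", "device_detector", "video_enabler"}
	var wg sync.WaitGroup
	for _, name := range startup {
		wg.Add(1)
		go func(n string) {
			defer wg.Done()
			runProcess(n) // each child gets its own goroutine
		}(name)
	}
	wg.Wait()
}
```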

config.json

The config.json configuration file provides the settings for all of the components being run. The following components read settings from it:

  • coordinator
  • Makefile
  • mirror_feed
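A sketch of how a component might read its settings from config.json, using Go's encoding/json. The keys shown ( stf_ip, video_port ) are hypothetical placeholders, not the real schema of the project's config.json.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Config mirrors a *hypothetical* subset of config.json; the real
// stf_ios_support file has its own keys, so treat these as placeholders.
type Config struct {
	StfIP     string `json:"stf_ip"`
	VideoPort int    `json:"video_port"`
}

func parseConfig(data []byte) (Config, error) {
	var c Config
	err := json.Unmarshal(data, &c)
	return c, err
}

func main() {
	raw := []byte(`{"stf_ip": "192.168.1.10", "video_port": 8000}`)
	cfg, err := parseConfig(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("STF at %s, video on port %d\n", cfg.StfIP, cfg.VideoPort)
}
```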

video_enabler

The video enabler is a few lines of Objective C code that call an Apple API to enable video streaming from IOS devices. The information provided by Apple about these lines of code says they can go into the program that uses AVFoundation to read the video data. That is true, but if they are run only from the program reading from AVFoundation, the video is not activated properly in all cases.

If a separate program ( the video enabler ) also runs these lines of permission code, in addition to having them in a program using AVFoundation to read IOS video, then everything works reliably.

stf_device_ios unit

For every device ( Android or IOS ) connected to a "provider machine" providing devices out to an STF server, there needs to be a running "device unit". The upstream OpenSTF project provides a "device unit" for Android devices, but does not provide one for IOS. Such a unit has been coded to work the same way an Android device unit works. The device IOS unit is started this way:

  1. The device_detector "detects" that a USB IOS device was plugged in
  2. The device_detector makes an HTTP call to the coordinator's "/dev_connect" endpoint
  3. The coordinator begins starting all of the various processes to support the device
  4. The coordinator sends a ZeroMQ message to the stf_provider notifying it of a device connect
  5. As the last process started for the device, the coordinator starts the stf_device_ios unit

This process is a bit different from how the original OpenSTF provider works. The normal Android process is this:

  1. The provider uses ADB to detect that a device has been connected
  2. The provider starts the stf_device unit

The process has been changed for IOS to enable the coordinator to have full control over all the running processes. Currently the NodeJS STF device unit and provider code have been repurposed to work with IOS, but it would make sense over time to migrate all the functionality into Golang, so that there is not a mixture of languages providing IOS support.

What the device_ios_unit does in general is the following:

  1. Registers the IOS device against the STF Server
  2. Listens for commands from STF Server
  3. When commands are received ( such as a click command ) they are generally translated into calls to WebDriverAgent. The calls are made and the results returned to STF Server.
  4. In some cases commands are turned into runs of LibIMobileDevice CLI programs instead of WDA calls.

Essentially, the device_ios_unit translates generic STF instructions into IOS instructions.
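The translation described in step 3 can be sketched as building an HTTP request to WebDriverAgent from a generic tap command. The WDA endpoint path and payload shape used here are assumptions for illustration; consult WebDriverAgent's route definitions for the real contract.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// tapRequest builds the HTTP request a device unit might send to
// WebDriverAgent to carry out an STF "tap" command. The path and
// JSON body are illustrative guesses at the WDA contract.
func tapRequest(wdaURL, sessionID string, x, y float64) (*http.Request, error) {
	body, err := json.Marshal(map[string]float64{"x": x, "y": y})
	if err != nil {
		return nil, err
	}
	url := fmt.Sprintf("%s/session/%s/wda/tap/0", wdaURL, sessionID)
	req, err := http.NewRequest("POST", url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := tapRequest("http://127.0.0.1:8100", "sess1", 100, 200)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```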

wda_proxy

A WDA proxy is used to expose the WebDriverAgent port on multiple interfaces instead of just localhost, so that WDA can be communicated with by the STF server. The WDA proxy has been modified to support running multiple devices / WDAs simultaneously on a single provider machine.

The WDA proxy has additionally been modified to log all incoming request details as well as the outgoing response from WebDriverAgent, for diagnostic purposes.

ffmpeg

Rather than using a program running on the IOS device to feed out a video stream, video is read out directly over USB using ffmpeg. This is possible because IOS already has built-in video output feed code for streaming over USB. That code is primarily there to make it possible to use an HDMI output adapter. Apple has nicely also made it possible to get that same video stream using the AVFoundation API.

Standard ffmpeg already has support for reading video from AVFoundation. It does not have the magical API calls to enable that to work with IOS devices. Those few lines of code have been added to ffmpeg in order to enable this.

The coordinator uses ffmpeg to transcode the video output from the phone ( H264 ) into a MJPEG stream. That MJPEG stream is sent to mirror_feed using a named pipe.
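The transcoding step can be illustrated by the shape of the ffmpeg invocation. The flags below are assumptions chosen for illustration ( the project's real invocation differs ); the point is the pipeline: AVFoundation input, MJPEG output, written to a named pipe.

```go
package main

import (
	"fmt"
	"strings"
)

// ffmpegArgs builds an illustrative ffmpeg command line for reading a
// device's H264 feed via AVFoundation and transcoding it to MJPEG on a
// named pipe. Flag choices are assumptions, not the project's real ones.
func ffmpegArgs(deviceName, pipePath string) []string {
	return []string{
		"-f", "avfoundation", // capture input through AVFoundation
		"-i", deviceName, // the IOS device, by capture-device name
		"-f", "mjpeg", // transcode H264 -> MJPEG
		"-q:v", "5", // JPEG quality ( illustrative value )
		pipePath, // write the stream to a named pipe
	}
}

func main() {
	args := ffmpegArgs("iPhone", "/tmp/video.pipe")
	fmt.Println("ffmpeg " + strings.Join(args, " "))
}
```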

mirror_feed

The mirror feed program receives an MJPEG video stream from ffmpeg. It parses that stream into individual JPEG frames and queues them up for sending to a frontend listener over a websocket. When a websocket connection is received, and for as long as it remains open, the mirror_feed program streams out the queued frames.
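The frame-parsing step can be sketched by scanning the byte stream for JPEG start-of-image ( 0xFFD8 ) and end-of-image ( 0xFFD9 ) markers. This is a simplified illustration, not mirror_feed's actual parser; among other things it ignores partial frames that span read boundaries.

```go
package main

import (
	"bytes"
	"fmt"
)

// splitFrames scans an MJPEG byte stream for JPEG SOI ( 0xFFD8 ) and
// EOI ( 0xFFD9 ) markers and returns the complete frames found, roughly
// what mirror_feed must do before queueing frames for the websocket.
func splitFrames(stream []byte) [][]byte {
	var frames [][]byte
	soi := []byte{0xFF, 0xD8}
	eoi := []byte{0xFF, 0xD9}
	for {
		start := bytes.Index(stream, soi)
		if start < 0 {
			break
		}
		end := bytes.Index(stream[start:], eoi)
		if end < 0 {
			break // incomplete frame; a real parser would buffer it
		}
		frames = append(frames, stream[start:start+end+2])
		stream = stream[start+end+2:]
	}
	return frames
}

func main() {
	// Two tiny fake "frames" back to back ( not valid JPEG payloads ).
	stream := []byte{0xFF, 0xD8, 0x01, 0xFF, 0xD9, 0xFF, 0xD8, 0x02, 0xFF, 0xD9}
	fmt.Println(len(splitFrames(stream))) // prints 2
}
```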

The mirror feed program is named as it is because it could theoretically mirror video frames to multiple places simultaneously ( such as recording them at the same time as showing them in STF ). Right now it is only used to show video in STF, and the simultaneous mirroring ability has not been tested ( though it is already planned for ).

Currently the framerate is hardcoded to 1fps. This is done to reduce bandwidth and is not a technical limitation; the framerate could be increased and everything continues to function without issue at higher framerates ( while also consuming more bandwidth ).

Overview Diagram