# Computer Vision Feasibility Analysis
## *Framerate and Camera Requirements*
*Matt Schneider 10/6/25*

### Calculating required FPS

To determine the frame rate our camera needs for good tracking results, we need to set requirements on how fast the target moves and how much inter-frame motion we allow. Similarly, we need to limit motion blur so that centroiding or position estimation in the frame is reliable.

The drone's lateral motion can be translated into an angular rate, which in turn gives its rate of motion across the image in pixels. Assuming the drone is moving laterally relative to the camera, the angular rate is given by:

$$
\omega \approx \frac{v_\bot}{R}
$$

Where $v_\bot$ is the drone's lateral velocity in meters per second and $R$ is the range in meters. The pixel rate $\dot{p}$ is then given by:

$$
\dot{p} = \omega \cdot \frac{N_{px}}{FOV_{rad}}
$$

Here $N_{px}$ is the width of the image in pixels and $FOV_{rad}$ is the horizontal field of view in radians. We want the FPS of the camera to be at least the ratio $\frac{\dot{p}}{p_{max}}$, where $p_{max}$ is the allowable inter-frame motion in pixels. I will be aiming for no more than 5 pixels of inter-frame motion in the following analysis.

For blur, we want the drone to move no more than 1 pixel during the exposure time of a single frame. This means we want our exposure time to be no longer than roughly $\frac{1}{\dot{p}}$ seconds.

I will model the target as a DJI Phantom 3, which is roughly 50 cm across, flying at 15 m/s.

#### Plugging in the numbers

First, the angular rate at our target distance of 500 ft (152 m) and a velocity of 15 m/s:

$$
\omega \approx \frac{15}{152} \approx 0.09868 \text{ rad/s}
$$

Using $p_{max} = 5$ px, and for this specific calculation an image width of 1920 px and a FOV of $60$° (1.0472 rad):

$$
\dot p = 0.09868 \cdot \frac{1920}{1.0472} = 180.9336 \text{ px/sec}
$$

Then to calculate desired FPS:

$$
\text{FPS} \ge \frac{180.9336}{5} = 36.1867 
$$

Now to calculate the maximum exposure time per frame that keeps blur to about 1 px:

$$
\text{exposure time} = \frac{1}{180.9336} = 0.0055269 \text{ sec} = 5.52689 \text{ ms}
$$

This is likely a reasonable target for us to reach, provided the drone covers enough pixels for our detection and tracking algorithms at 500 ft with a wide $60$° FOV lens. A FOV this wide most likely cannot give us the apparent size those algorithms need. But at least our FPS requirements are within reasonable specs for camera modules that are not specialty sensors.
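
The hand calculation above can be reproduced with a few lines of Python. This is a minimal sketch of the formulas in this section, not the `CameraClassifier` code; the function name and its parameters are made up for illustration, and it assumes pixels are spread evenly across the FOV.

```python
import math

def required_fps_and_exposure(v_lateral, range_m, n_px, fov_rad, p_max=5.0, blur_px=1.0):
    """Return (pixel_rate, min_fps, max_exposure_s) for a laterally moving target.

    Hypothetical helper: names and defaults are assumptions for this sketch.
    """
    omega = v_lateral / range_m            # small-angle angular rate, rad/s
    pixel_rate = omega * n_px / fov_rad    # px/s, assumes uniform pixels per radian
    min_fps = pixel_rate / p_max           # keeps inter-frame motion <= p_max
    max_exposure_s = blur_px / pixel_rate  # keeps blur <= blur_px
    return pixel_rate, min_fps, max_exposure_s

pixel_rate, fps, exposure = required_fps_and_exposure(
    v_lateral=15.0, range_m=152.0, n_px=1920, fov_rad=math.radians(60.0)
)
print(f"{pixel_rate:.1f} px/s, {fps:.1f} FPS, {exposure * 1e3:.2f} ms")
# prints: 180.9 px/s, 36.2 FPS, 5.53 ms
```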

#### Script for calculating FPS based on camera specs

All code for the `CameraClassifier` class is on [our GitHub](https://github.com/luckyowl20/PSS_CUAS/blob/master/CameraClassifier.py). This notebook is also on there.

In [20]:
# helper code
import numpy as np

# homemade camera classifier class
from CameraClassifier import Camera

def ftom(feet):
    return feet * 0.3048

In [21]:
# numbers to match the above calculations
p_um = 3.0
f_mm = 5.0
sizes = [1920]
range = ftom(500)
velocity = 15.0
p_max = 5.0
object_size = 0.5

camera = Camera("demo cam", sizes, p_um, f_mm, p_max)
results = camera.run_analysis(velocity, range, object_size, print_results=True)

Analysis for: demo cam at 5.0 mm lens
             FPS  Exposure Time (ms)   HFOV  Apparent Size (px)
Size (px)                                                      
1920       36.16                5.53  59.88                5.47
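
The `HFOV` and `Apparent Size (px)` columns are consistent with a simple pinhole camera model. The sketch below is my reconstruction under that assumption, not the actual `Camera` implementation (see the GitHub link for that); the function names are hypothetical. It assumes each output pixel maps to one native sensor pixel, so the sensor width used is image width times pixel pitch.

```python
import math

def hfov_deg(n_px, pitch_um, focal_mm):
    """Horizontal FOV of a pinhole camera, assuming n_px native pixels are read out."""
    sensor_width_mm = n_px * pitch_um * 1e-3
    return math.degrees(2.0 * math.atan(sensor_width_mm / (2.0 * focal_mm)))

def apparent_size_px(object_m, range_m, pitch_um, focal_mm):
    """Width of the target on the sensor, in pixels (small-angle pinhole projection)."""
    image_size_mm = focal_mm * object_m / range_m
    return image_size_mm / (pitch_um * 1e-3)

range_m = 500 * 0.3048  # 500 ft in meters
print(round(hfov_deg(1920, 3.0, 5.0), 2))                   # 59.88
print(round(apparent_size_px(0.5, range_m, 3.0, 5.0), 2))   # 5.47
```

Both values match the demo cam row above, which suggests the class uses an equivalent model, though that is an inference from the output rather than something confirmed here.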


### An analysis of some camera modules that we may use
Cameras in order of analysis:
1. [TechNexion: VLS-GM2-AR1335-C-CB-IR](https://www.technexion.com/shop/serdes/gmsl/gmsl2/vls-gm2-ar1335-c-cb-ir/); these sit around $100 without lenses

2. *these will not be cheap* [NileCAM82 - Sony® STARVIS™ IMX485 4K GMSL2 Camera Module](https://www.e-consystems.com/camera-modules/sony-starvis-imx485-4k-gmsl2-camera-module.asp)

3. *these will not be cheap* [NileCAM87_CUOAGX](https://www.e-consystems.com/nvidia-cameras/jetson-agx-orin-cameras/4k-sony-starvis2-imx585-gmsl-camera.asp) Datasheets on this sensor are extremely limited and will need further research.

In [22]:
# analysis for camera 1
# https://www.technexion.com/shop/serdes/gmsl/gmsl2/vls-gm2-ar1335-c-cb-ir/
import pandas as pd

# FPS for these sizes:
# 60, 30, 15, 10
sizes = [1920, 2560, 3480, 4208]

p_um = 1.1
f_mm = 6.0
range = ftom(500)
velocity = 15.0
p_max = 5.0
object_size = 0.5

camera = Camera("cam 1", sizes, p_um, f_mm, p_max)
results = camera.run_analysis(velocity, range, object_size, print_results=True)

Analysis for: cam 1 at 6.0 mm lens
              FPS  Exposure Time (ms)   HFOV  Apparent Size (px)
Size (px)                                                       
1920       108.47                1.84  19.96                17.9
2560       109.32                1.83  26.41                17.9
3480       110.92                1.80  35.39                17.9
4208       112.50                1.78  42.19                17.9


#### Results from camera 1
Since this camera's fastest frame rate is 60 FPS at the 1920x1080 image size, and the analysis above calls for roughly 108 FPS at that size, this camera cannot meet our requirements unless we change to a much wider FOV.
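
To get a feel for how much wider the FOV would have to be, the FPS formula can be inverted to find the narrowest FOV a given frame rate can support. This is a rough sketch with hypothetical helper names. It reuses the pinhole assumption (one native pixel per output pixel) and the same 15 m/s, 500 ft, 5 px requirements.

```python
import math

def min_hfov_for_fps(v_lateral, range_m, n_px, p_max, fps_limit):
    """Narrowest HFOV (rad) that keeps inter-frame motion <= p_max at fps_limit."""
    omega = v_lateral / range_m
    return omega * n_px / (p_max * fps_limit)

def focal_for_hfov(n_px, pitch_um, hfov_rad):
    """Focal length (mm) giving the requested HFOV under a pinhole model."""
    return n_px * pitch_um * 1e-3 / (2.0 * math.tan(hfov_rad / 2.0))

range_m = 500 * 0.3048
hfov = min_hfov_for_fps(15.0, range_m, n_px=1920, p_max=5.0, fps_limit=60.0)
focal = focal_for_hfov(1920, pitch_um=1.1, hfov_rad=hfov)
size_px = focal * 0.5 / range_m / 1.1e-3  # apparent size of a 0.5 m target

print(f"HFOV >= {math.degrees(hfov):.1f} deg, "
      f"focal length <= {focal:.2f} mm, apparent size ~ {size_px:.1f} px")
```

Under these assumptions camera 1 would need roughly a 36° HFOV (a focal length a little over 3 mm) to stay within 60 FPS, which leaves the drone under 10 px across.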

In [23]:
# analysis from camera 2 
# https://www.e-consystems.com/camera-modules/sony-starvis-imx485-4k-gmsl2-camera-module.asp

# supported FPS: 62, 39
sizes = [1920, 3840]

p_um = 2.9
f_mm = 6.0
range = ftom(500)
velocity = 15.0
p_max = 5.0
object_size = 0.5

camera = Camera("cam 2", sizes, p_um, f_mm, p_max)
results = camera.run_analysis(velocity, range, object_size, print_results=True)

Analysis for: cam 2 at 6.0 mm lens
             FPS  Exposure Time (ms)   HFOV  Apparent Size (px)
Size (px)                                                      
1920       43.50                4.60  49.78                6.79
3840       50.52                3.96  85.72                6.79


#### Results from camera 2
This camera is starting to be able to fill our requirements: at the 1920 image size the required 43.5 FPS is within the supported 62 FPS, although at 3840 the required 50.5 FPS exceeds the supported 39 FPS. It would be great if we could have a 4K input stream at a sufficient frame rate so that we can have a much wider field of view. It would also help to have the binning data for this sensor so we could know the true apparent size of the drone at the 1920 image size.

#### Notes about camera 3
This is likely the most expensive of the group, estimated at $200-500 per unit based on basic Google searches for the sensor chip inside the unit.

This sensor has a special window cropping mode where we can pick the size of the image we want to take. This lets us capture custom image sizes, so we can increase the frame rate by not reading out the entire sensor.

This sensor also has the highest full-sensor readout rate of the three. The entire 3840x2160 image can be read out at 60 FPS with full 12-bit color or 90 FPS with reduced 10-bit color. This fills the nice-to-have requirement identified with the second camera, and is almost overkill, as we need neither 12-bit color nor 90 FPS at 4K. We would likely be unable to process this data stream as it is, let alone if we double our cameras to decrease our FOV for better tracking.
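
For a sense of scale, the raw data rate of those readout modes can be estimated as width × height × FPS × bit depth. This is a back-of-the-envelope sketch only: the helper name is hypothetical, and it assumes one raw sample per pixel with no compression or link overhead, which may not match how the module actually transmits data.

```python
def raw_data_rate_gbps(width, height, fps, bits_per_px):
    """Uncompressed sensor data rate in Gbit/s, assuming one sample per pixel."""
    return width * height * fps * bits_per_px / 1e9

print(round(raw_data_rate_gbps(3840, 2160, 60, 12), 2))  # 5.97
print(round(raw_data_rate_gbps(3840, 2160, 90, 10), 2))  # 7.46
print(round(raw_data_rate_gbps(1920, 1080, 60, 10), 2))  # 1.24
```

The 4K modes carry roughly five to six times the raw data of a 1920x1080 stream at 60 FPS, before any processing is done on it.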

In [24]:
# camera 3 analysis - datasheet is very limited
# https://www.e-consystems.com/nvidia-cameras/jetson-agx-orin-cameras/4k-sony-starvis2-imx585-gmsl-camera.asp
sizes = [3840]

p_um = 2.9
f_mm = 6.0
range = ftom(500)
velocity = 15.0
p_max = 5.0
object_size = 0.5

camera = Camera("cam 3", sizes, p_um, f_mm, p_max)
results = camera.run_analysis(velocity, range, object_size, print_results=True)

Analysis for: cam 3 at 6.0 mm lens
             FPS  Exposure Time (ms)   HFOV  Apparent Size (px)
Size (px)                                                      
3840       50.52                3.96  85.72                6.79


#### Results for camera 3
This camera requires more research into the price (pending quote request) and into whether we can use several of these cameras at a narrower FOV to increase our tracking distance. It is promising and shows that we can reach the ideal specs described below, where the new problem becomes money and compute.

### Ideal camera specs from our current requirements
Quick refresher of our current requirements:
- Maximum inter-frame movement of drone: 5 pixels
- Maximum blurring of each frame: ~1px

The following code cell simulates an ideal camera spec sheet and runs the same calculations as above against those requirements. I will attempt to keep $FOV_{min}$ at no less than $45$°.

In [25]:
# ideal camera spec simulation
p_um = 3.0 # haven't seen anything larger than this in my research
f_mm = 6.0 # available lens sizes: 2.8, 4, 6, 8, 12, 16

range = ftom(500)
velocity = 15.0
p_max = 5.0
object_size = 0.5

# image sizes will scale with FPS, the smallest sizes have the highest FPS
sizes = [1920, 2560, 3480, 4208]

camera = Camera("ideal cam", sizes, p_um, f_mm, p_max)
results = camera.run_analysis(velocity, range, object_size, print_results=True)

Analysis for: ideal cam at 6.0 mm lens
             FPS  Exposure Time (ms)   HFOV  Apparent Size (px)
Size (px)                                                      
1920       42.23                4.74  51.28                6.56
2560       44.26                4.52  65.24                6.56
3480       47.84                4.18  82.05                6.56
4208       51.09                3.91  92.90                6.56


### Conclusions - what this means for our cameras and/or requirements

We either need to find cameras and lenses that fulfill these technical requirements:
1. 100+ FPS for small pixel pitch ~$1.0-1.5\mu m$
2. Large pixel pitch ~$3.0 \mu m$
3. Global shutter *(less needed if FPS requirement is lower)*

Or we can adjust our requirements to something physically possible within our budget restrictions:
1. Increasing maximum blur tolerance
2. Increasing allowable inter-frame motion
3. Reducing tracking distance
4. Increasing expected drone size
5. Decreasing expected drone velocity



### Analysis with camera 2 and reduced drone constraints

The expected lateral velocity of the drone has been halved and the allowed inter-frame movement has been doubled; the blur requirement stays the same.
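
As a rough check on what to expect from these changes, substituting the angular rate into the pixel rate gives the required frame rate in terms of the quantities being relaxed:

$$
\text{FPS}_{min} = \frac{\dot{p}}{p_{max}} = \frac{v_\bot}{R} \cdot \frac{N_{px}}{FOV_{rad}} \cdot \frac{1}{p_{max}}
$$

At a fixed lens and image size, halving $v_\bot$ and doubling $p_{max}$ should therefore cut the required FPS to one quarter:

$$
\frac{\text{FPS}_{reduced}}{\text{FPS}_{original}} = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}
$$

The blur-limited exposure time $\frac{1}{\dot{p}}$ depends only on the velocity, so it should double. For camera 2 with the 6 mm lens at 1920 px, the earlier result of 43.50 FPS and 4.60 ms predicts about $43.50 / 4 \approx 10.9$ FPS and $2 \cdot 4.60 = 9.20$ ms.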

Analysis for a 4mm, 6mm, and 8mm lens.

In [26]:
# supported FPS: 62, 39
sizes = [1920, 3840]

p_um = 2.9
f_mm = 6.0
range = ftom(500)
velocity = 7.5
p_max = 10.0
object_size = 0.5

lenses = [4.0, 6.0, 8.0]

camera = Camera("reduced constraints cam 2", sizes, p_um, f_mm, p_max)
for lens in lenses:
    camera.f_mm = lens
    camera.run_analysis(velocity, range, object_size, print_results=True)


Analysis for: reduced constraints cam 2 at 4.0 mm lens
            FPS  Exposure Time (ms)    HFOV  Apparent Size (px)
Size (px)                                                      
1920       7.77               12.87   69.68                4.53
3840       9.97               10.03  108.61                4.53
Analysis for: reduced constraints cam 2 at 6.0 mm lens
             FPS  Exposure Time (ms)   HFOV  Apparent Size (px)
Size (px)                                                      
1920       10.87                9.20  49.78                6.79
3840       12.63                7.92  85.72                6.79
Analysis for: reduced constraints cam 2 at 8.0 mm lens
             FPS  Exposure Time (ms)   HFOV  Apparent Size (px)
Size (px)                                                      
1920       14.11                7.09  38.38                9.05
3840       15.54                6.44  69.68                9.05


#### Reduced constraints results
These results leave a lot of frame rate headroom (roughly 8 to 16 FPS required), so we can push back toward the worst-case conditions and still keep our cameras at or within their frame rate limits. The deciding factor in whether we can use the limits proposed earlier in this analysis is the focal length of the lens.

### Limitations of this analysis
Since datasheets do not contain implementation details about how many pixels are used for each frame size, it is difficult to determine how the apparent size of the drone will change across each camera's frame rate and image size options. I expect that some sort of cropping or binning will take effect.

If a sensor simply crops the image, then the apparent size does not change, since the same number of effective pixels is still used to represent the drone. This is equivalent to using a narrower FOV lens.

If a sensor is using [binning](https://en.wikipedia.org/wiki/Pixel_binning), where groups of pixels on the sensor are combined to represent a single larger pixel at lower resolution, the apparent size of the drone will decrease. This will make it harder for us to run our algorithms, since the drone takes up fewer pixels, furthering the need for tighter FOV lenses. On the other hand, binning can sometimes increase the signal-to-noise ratio. Most sensors do use this technique at lower resolutions. This analysis did not model binning, as it is hard to find information on it for each camera.
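
The difference between the two cases can be sketched by treating binning as a larger effective pixel pitch. The code below is illustrative only: the helper and the `bin_factor` parameter are hypothetical, and the 2x2 binning mode for camera 2 is an assumption for the sake of the example, not something taken from its datasheet.

```python
import math

def view_of_target(n_out_px, pitch_um, focal_mm, object_m, range_m, bin_factor=1):
    """Return (HFOV in degrees, apparent size in px) for a cropped or binned readout.

    bin_factor=1 models a plain crop of n_out_px native pixels;
    bin_factor=2 models 2x2 binning, where each output pixel spans 2 native pixels.
    """
    eff_pitch_mm = pitch_um * 1e-3 * bin_factor
    width_mm = n_out_px * eff_pitch_mm
    hfov = math.degrees(2.0 * math.atan(width_mm / (2.0 * focal_mm)))
    size_px = focal_mm * object_m / range_m / eff_pitch_mm
    return hfov, size_px

range_m = 500 * 0.3048
cropped = view_of_target(1920, 2.9, 6.0, 0.5, range_m, bin_factor=1)
binned = view_of_target(1920, 2.9, 6.0, 0.5, range_m, bin_factor=2)
print(f"cropped: HFOV {cropped[0]:.2f} deg, size {cropped[1]:.2f} px")
print(f"binned:  HFOV {binned[0]:.2f} deg, size {binned[1]:.2f} px")
```

With cropping, the 1920 image keeps the 6.79 px apparent size but narrows the view to about 50°. With the assumed 2x2 binning, it keeps the full-sensor view of about 86° but the drone shrinks to roughly 3.4 px.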

### Conclusion

Computer vision is definitely possible for this project and quite feasible from the camera technology perspective. The budget limit of this project is an area for improvement that we can raise with the client if it comes down to it. We will be pushing the limits of what we can get for our money in terms of frame rates. The biggest lesson learned from this analysis is that larger pixel pitch sensors do a much better job at the ranges we are expecting, because we are able to rely on lower resolution images to get the frame rates we need.


I believe that we still need a tighter FOV to get the apparent size required for robust and reliable drone tracking algorithms to work effectively. Ideally we should have an apparent size of at least 15-20 px to be able to run tracking. This number is based on some preliminary research into which methods we need in order to track. Similarly, Kenji's report seems to lead to the same conclusion: we need roughly 20 pixels for our computer vision system to reach 3 degree accuracy. The trade-off is that we require higher frame rates for this to function as intended, since the narrower the FOV, the more pixels the drone moves between frames.
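
This trade-off can be put into rough numbers by solving for the focal length that gives a target apparent size and then computing the frame rate that lens demands. The sketch below uses the same pinhole assumptions as the rest of this analysis (no binning, 15 m/s at 500 ft, 5 px inter-frame motion) with a 2.9 µm sensor at the 1920 image size; the helper names are hypothetical.

```python
import math

def focal_for_apparent_size(target_px, pitch_um, object_m, range_m):
    """Focal length (mm) that makes the object span target_px pixels."""
    return target_px * pitch_um * 1e-3 * range_m / object_m

def required_fps(v_lateral, range_m, n_px, pitch_um, focal_mm, p_max):
    """Minimum FPS that keeps inter-frame motion <= p_max for a given lens."""
    hfov_rad = 2.0 * math.atan(n_px * pitch_um * 1e-3 / (2.0 * focal_mm))
    return (v_lateral / range_m) * n_px / hfov_rad / p_max

range_m = 500 * 0.3048
for target_px in (15, 20):
    f_mm = focal_for_apparent_size(target_px, 2.9, 0.5, range_m)
    fps = required_fps(15.0, range_m, 1920, 2.9, f_mm, 5.0)
    print(f"{target_px} px -> {f_mm:.1f} mm lens, {fps:.0f} FPS required")
```

Under these assumptions, 15-20 px of apparent size needs a lens of roughly 13-18 mm and a frame rate on the order of 90-120 FPS at the original requirements.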

Although this is good news, we run into a new problem. The Nvidia Jetson Orin Nano board will likely not be able to handle running our algorithms over multiple camera data streams at these rates. 4K at 50 FPS is very likely out of reach, and 1920 at 60 FPS leaves very little room for actually running our algorithms **and** coordinating with all the other systems, networking, launcher kinematics, etc. We would likely need to upgrade our compute module in order to make this robust CV system work as we want it to. This again means another large increase in cost for this project. The next step up in compute capacity of the Nvidia Jetson series is the Nvidia Jetson Orin NX board or the older Xavier NX board.