I have no idea why, but back when 2021 first started, I was listening to Speed of Light by DJ Okawari (you know, that Arknights song). The video was basically just a looping gif of Texas flying a drone with Exusiai, and for some reason I ended up really wanting a drone. So I bought one from DJI that could be programmed with Python, the Tello EDU. I've been meaning to do this project for a while now, so here it is. A GUI made using Tkinter for my DJI Tello EDU, named Flydo (after Fido from 86).
To get started, you first need to train the PyTorch model yourself, since the saved .pth file was too large for me to upload:
model/train.py
Then you can run the actual GUI:
app.py
-f https://download.pytorch.org/whl/torch_stable.html
torch==1.10.1+cu113
torchvision==0.11.2+cu113
torchaudio===0.10.1+cu113
djitellopy==2.4.0
pycocotools-windows==2.0.0.2
pygame==2.1.2
opencv-python==4.5.5.62
djitellopy is built on the official DJI Tello/Tello EDU SDK. It makes it easy to send any Tello command, retrieve the drone's video stream, and receive and parse state packets. All of Flydo's drone commands go through this library.
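To give a feel for the library, here is a minimal sketch (not the code from app.py). `fly_square` and `run_on_real_drone` are hypothetical names I made up for illustration; the `Tello` methods they call (`connect`, `get_battery`, `streamon`, `get_frame_read`, `takeoff`, `move_forward`, `rotate_clockwise`, `land`, `streamoff`, `end`) are part of djitellopy.

```python
def fly_square(drone, side_cm=50):
    """Take off, trace a square with sides of `side_cm`, then land.

    `drone` is anything exposing the djitellopy.Tello methods used below.
    """
    drone.takeoff()
    try:
        for _ in range(4):
            drone.move_forward(side_cm)   # distance in centimetres
            drone.rotate_clockwise(90)    # angle in degrees
    finally:
        drone.land()                      # always try to land, even after an error


def run_on_real_drone():
    """Connect to a Tello over Wi-Fi and fly the square (hypothetical entry point)."""
    # Imported here so fly_square itself has no hard dependency on the library.
    from djitellopy import Tello

    drone = Tello()
    drone.connect()                        # puts the drone into SDK mode
    print("Battery:", drone.get_battery(), "%")
    drone.streamon()                       # start the video stream
    frame = drone.get_frame_read().frame   # most recent frame as a NumPy array
    print("Frame shape:", frame.shape)
    fly_square(drone)
    drone.streamoff()
    drone.end()
```

Calling `run_on_real_drone()` needs a drone on the same Wi-Fi network. Keeping the flight logic in a function that takes the drone as an argument also means it can be tried out with a fake object first.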
Tkinter was used to create the GUI for controlling the drone. Drone commands run on separate threads so that the display doesn't freeze while the drone is moving (this took me a while to figure out). The result is shown below.
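The threading idea can be sketched roughly like this. It is one common pattern and not necessarily how app.py does it: a blocking command runs on a worker thread and reports back through a queue, and the GUI polls that queue with `after` so widgets are only touched from the main thread. `run_in_background` and `build_gui` are hypothetical names.

```python
import queue
import threading


def run_in_background(command, results, *args):
    """Run a blocking command on a daemon thread.

    The return value (or the raised exception) is put on `results` together
    with the command's name, so the GUI thread can pick it up without blocking.
    """
    name = getattr(command, "__name__", "command")

    def worker():
        try:
            results.put((name, command(*args)))
        except Exception as exc:
            results.put((name, exc))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def build_gui(slow_command):
    """Hypothetical window with one button and one status label."""
    import tkinter as tk

    root = tk.Tk()
    results = queue.Queue()
    status = tk.Label(root, text="idle")
    status.pack()
    tk.Button(
        root,
        text="Go",
        command=lambda: run_in_background(slow_command, results),
    ).pack()

    def poll():
        # Widgets are only updated here, on the main thread.
        try:
            name, outcome = results.get_nowait()
            status.config(text=f"{name}: {outcome}")
        except queue.Empty:
            pass
        root.after(100, poll)  # check again in 100 ms

    poll()
    return root
```

With a real drone, something like `build_gui(drone.takeoff).mainloop()` would keep the window responsive during takeoff, because the main loop never waits on the command itself.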
PyTorch's pretrained Faster R-CNN model was used for the head-tracking feature. The model was fine-tuned using transfer learning on a custom dataset. It actually works a lot better than I thought it would, especially given how little training data and how few epochs I used!
Coco-Annotator is an image-labelling tool created by felixdollack that I also use for my research. It outputs data in COCO format for object detection and instance segmentation. I used it to label my custom dataset, which I collected with Flydo's own camera. It's basically just a bunch of pictures of me in my room.
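One detail worth knowing when feeding COCO labels to Faster R-CNN: COCO stores each box as `[x, y, width, height]`, while torchvision's detection models expect `[x1, y1, x2, y2]`. A minimal sketch of the conversion follows; `load_coco_boxes` is a hypothetical helper, not a function from this repo.

```python
import json


def load_coco_boxes(annotation_json):
    """Group COCO boxes by image file name, converted to corner format."""
    coco = json.loads(annotation_json)
    file_names = {img["id"]: img["file_name"] for img in coco["images"]}
    boxes = {name: [] for name in file_names.values()}
    for ann in coco["annotations"]:
        x, y, w, h = ann["bbox"]                       # COCO: top-left corner + size
        corners = [x, y, x + w, y + h]                 # torchvision: two corners
        boxes[file_names[ann["image_id"]]].append(corners)
    return boxes
```

For a file on disk, `load_coco_boxes(open(path).read())` would do, where `path` points at the exported annotation file.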
Flydo can now take screenshots and videos!
Flydo's tracking function now generates and overlays bounding boxes!
app.py's associated functions have been refactored into class methods. All global variables have been removed and replaced with class variables.
Apparently the issue with the video feed latency has to do with the djitellopy library itself, and there isn't a way to avoid this unless you switch to TelloPy. I'm currently in the middle of midterms, but I'll make sure to do this soon.
- Create a function to move Flydo based on output of object detector.
- Improve accuracy of tracker (i.e. the object detection model).
- Figure out how to decrease latency between drone feed and visual display.
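For the first item, one possible starting point is a simple proportional controller that turns the offset between the detected box and the frame centre into velocities. This is only a sketch: `box_to_velocities`, the gain, and the dead zone are made-up values that would need tuning on the real drone, and the 960x720 frame size is what the Tello streams by default.

```python
def clamp(value, low, high):
    """Limit value to the range [low, high]."""
    return max(low, min(high, value))


def box_to_velocities(box, frame_size=(960, 720), gain=0.25, deadzone=40):
    """Map a detected box to (yaw, up_down) velocities in the range -100..100.

    box: (x1, y1, x2, y2) in pixels.
    gain: velocity per pixel of offset (assumed value, needs tuning).
    deadzone: offsets smaller than this many pixels are ignored.
    """
    x1, y1, x2, y2 = box
    width, height = frame_size
    dx = (x1 + x2) / 2 - width / 2     # > 0: target is right of centre
    dy = height / 2 - (y1 + y2) / 2    # > 0: target is above centre (image y grows downward)
    yaw = 0 if abs(dx) < deadzone else int(clamp(gain * dx, -100, 100))
    up_down = 0 if abs(dy) < deadzone else int(clamp(gain * dy, -100, 100))
    return yaw, up_down
```

The result could then be passed to djitellopy as `drone.send_rc_control(0, 0, up_down, yaw)`. It's worth checking the sign conventions at low speed before trusting it.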
Thanks to Takashi Nakamura, PhD, for writing this article, which showed me the basics of training Faster R-CNN with PyTorch.
This project is licensed under the MIT License - see the LICENSE file for details.