RubyQianru/Machine-Learning-TFJS

Machine Learning

  • This repo collects TensorFlow and TensorFlow.js computer vision experiments for the Machine Learning class, which focuses on browser-based machine learning.

AI Live Meeting

Preview

User testing with two classes of 20+ students.

Tech Stack

Responsibility

  • Spearheaded real-time HTTPS communication for web and mobile, enabling live video transmission and AI prediction.
  • Implemented neural networks on top of TensorFlow hand pose recognition, achieving 80%+ classification accuracy.
  • Integrated an automated data collection system, improving the user experience and boosting operational efficiency by 50%.

Design Thinking

  • What problem am I trying to address❓ I noticed how difficult it is to interact quickly with peers during multi-user livestream video calls (e.g. Zoom, Google Meet). For example, in an online class, a user who wants to raise a hand to ask a question has to click the emoji button -> select an emoji -> deselect the emoji (three steps) to complete the interaction with the professor.
  • How can AI help to solve this problem❓ An AI algorithm, potentially computer vision, can classify users' hand postures and directly emit signals to their peers.
  • What data is needed to create an AI to help address the issue❓ A series of input data that precisely captures human hand postures.

Data Collection

First prototype

Link

This prototype is based on Daniel Shiffman's The Coding Train. I reduced the wait time before data collection starts and extended the collection window, so that the system automatically records more data samples per session. This change improved the user experience of data collection.
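
The wait-then-collect timing described above can be sketched as follows. This is a minimal illustration in Python, not the prototype's actual code: `collect_samples`, `FakeClock`, the `"raise_hand"` label, the 63-value feature vector, and all durations are hypothetical names and values chosen for the example.

```python
import time
from typing import Callable, Dict, List


def collect_samples(
    read_sample: Callable[[], List[float]],
    label: str,
    wait_s: float,
    collect_s: float,
    interval_s: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[Dict]:
    """Wait `wait_s` seconds, then record labeled samples for `collect_s` seconds."""
    sleep(wait_s)  # countdown so the user can get into the pose
    samples: List[Dict] = []
    deadline = clock() + collect_s
    while clock() < deadline:
        samples.append({"label": label, "features": read_sample()})
        sleep(interval_s)  # pacing between consecutive samples
    return samples


class FakeClock:
    """Simulated time so the demo runs instantly and deterministically."""

    def __init__(self) -> None:
        self.t = 0.0

    def now(self) -> float:
        return self.t

    def sleep(self, seconds: float) -> None:
        self.t += seconds


def demo(wait_s: float, collect_s: float) -> List[Dict]:
    fake = FakeClock()
    # Hypothetical sensor: a constant 63-value feature vector.
    return collect_samples(
        lambda: [0.0] * 63, "raise_hand", wait_s, collect_s,
        interval_s=0.25, sleep=fake.sleep, clock=fake.now,
    )


# A long wait with a short window vs. a short wait with a long window.
before = demo(wait_s=2.0, collect_s=1.0)
after = demo(wait_s=0.5, collect_s=2.0)
print(len(before), len(after))
```

Shortening `wait_s` and lengthening `collect_s` yields more samples per session at the same sampling interval, which is the effect the paragraph describes.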

Second Prototype

Link

This prototype is based on TensorFlow Handpose and MediaPipe V2. It has higher performance and lower latency than the previous prototype.

Model Training

  • Deep Learning model trained with Jupyter Notebook: Link
  • util.py: Python functions to load data (load JSON data into NumPy arrays, shuffle data), preprocess data (split X_train, y_train into training and validation sets), build the model (define the neural network), and test the model.
  • main.ipynb: the main workflow for training the machine learning model step by step.
  • Model summary:
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 dense (Dense)               (None, 32)                2048      
                                                                 
 dense_1 (Dense)             (None, 4)                 132       
                                                                 
=================================================================
Total params: 2180 (8.52 KB)
Trainable params: 2180 (8.52 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
  • Model accuracy: 0.89552241563797
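
The data steps listed for util.py and the parameter counts in the model summary can be sketched as below. This is a dependency-free illustration using plain Python lists in place of NumPy arrays. The JSON record layout (`"label"` and `"landmarks"` keys), the function names, and the 20% validation fraction are assumptions, not the repo's actual format. The 63-value input is inferred from the summary (2048 = 63 × 32 + 32) and would match 21 hand landmarks with 3 coordinates each.

```python
import json
import random
from typing import List, Tuple

NUM_LANDMARKS = 21  # assumed hand keypoints per sample
COORDS = 3          # assumed x, y, z per keypoint


def flatten(landmarks: List[List[float]]) -> List[float]:
    """Turn 21 [x, y, z] points into one 63-value feature vector."""
    return [c for point in landmarks for c in point]


def load_samples(json_text: str) -> Tuple[List[List[float]], List[int]]:
    """Parse records of the assumed form {"label": int, "landmarks": [[x, y, z], ...]}."""
    records = json.loads(json_text)
    X = [flatten(r["landmarks"]) for r in records]
    y = [r["label"] for r in records]
    return X, y


def shuffle_and_split(X, y, val_fraction: float = 0.2, seed: int = 0):
    """Shuffle features and labels together, then hold out a validation slice."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_val = int(len(idx) * val_fraction)
    val_idx, train_idx = idx[:n_val], idx[n_val:]

    def pick(data, ids):
        return [data[i] for i in ids]

    return pick(X, train_idx), pick(y, train_idx), pick(X, val_idx), pick(y, val_idx)


def dense_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


# Synthetic stand-in data: 10 samples spread over 4 classes.
fake_records = [
    {"label": i % 4, "landmarks": [[float(i), 0.0, 0.0]] * NUM_LANDMARKS}
    for i in range(10)
]
X, y = load_samples(json.dumps(fake_records))
X_train, y_train, X_val, y_val = shuffle_and_split(X, y)

hidden = dense_params(NUM_LANDMARKS * COORDS, 32)  # first Dense layer
output = dense_params(32, 4)                       # 4-class output layer
print(len(X_train), len(X_val), hidden, output, hidden + output)
```

The two `dense_params` calls reproduce the 2048 and 132 parameter counts in the summary, for the reported total of 2180.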

Model Predictions: TensorFlowJS Usage on the Frontend

Run Server

  • Linux commands:
ssh root@qz2432.itp.io
root@qz2432.itp.io's password: 
root@ruby-zhang:~# cd ./live-web/week5
root@ruby-zhang:~/live-web/week5# node server.js

qz2432.itp.io:

  • Full code examples: Link
  • Frontend code examples: Link

Realtime Communication: WebRTC

  • Implementing a video chat application in the style of Zoom, Microsoft Teams, or Google Meet.
  • Technology: WebRTC provides APIs for capturing audio and video streams from the user's camera and microphone. These streams can be transmitted in real-time between peers, enabling video and audio calls directly in the browser without the need for third-party plugins.
  • Experience: Participants can join meetings via web browsers or dedicated applications on various devices.
  • Live chatbox created using the GSAP library and the DOM
  • Live video prototype using WebSocket
  • I tested the web application on the webcams of my two laptops. This live video prototype is built on HTML. The WebSocket server receives canvas data from each client and emits it to all other clients, and every client then updates the src of the element that shows the incoming video.
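
The relay pattern in the last bullet (one client sends a frame, every other client receives it and updates its display) can be sketched without any networking. This is an in-memory stand-in written in Python for illustration only; the repo's server is a Node script, and `FrameRelay`, `Viewer`, the client ids, and the data URL strings are all hypothetical.

```python
from typing import Callable, Dict


class FrameRelay:
    """In-memory stand-in for the socket server: forwards frames to other clients."""

    def __init__(self) -> None:
        self.clients: Dict[str, Callable[[str, str], None]] = {}

    def connect(self, client_id: str, on_frame: Callable[[str, str], None]) -> None:
        self.clients[client_id] = on_frame

    def disconnect(self, client_id: str) -> None:
        self.clients.pop(client_id, None)

    def publish(self, sender_id: str, frame: str) -> None:
        # Broadcast to everyone except the sender.
        for client_id, on_frame in list(self.clients.items()):
            if client_id != sender_id:
                on_frame(sender_id, frame)


class Viewer:
    """Stand-in for a browser client: keeps the latest frame per remote peer."""

    def __init__(self, client_id: str, relay: FrameRelay) -> None:
        self.client_id = client_id
        self.relay = relay
        self.latest_src: Dict[str, str] = {}
        relay.connect(client_id, self.receive)

    def receive(self, sender_id: str, frame: str) -> None:
        self.latest_src[sender_id] = frame  # the "update src" step

    def send(self, frame: str) -> None:
        self.relay.publish(self.client_id, frame)


relay = FrameRelay()
laptop_a = Viewer("laptop_a", relay)
laptop_b = Viewer("laptop_b", relay)

laptop_a.send("data:image/jpeg;base64,AAAA")  # placeholder frame payload
laptop_b.send("data:image/jpeg;base64,BBBB")
print(laptop_a.latest_src, laptop_b.latest_src)
```

Excluding the sender from the broadcast keeps each client from overwriting its own preview with a delayed copy of its own frame.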
