Neilblaze/GSOC-23

Project Details

An interactive web app which enables users to perform contactless interactions with the interface using simple human gestures. ✨

Background: The COVID-19 pandemic has increased awareness of hygiene risks associated with touchscreens, with reports indicating that 80% of people find them unhygienic. Touchless gesture-based intuitive systems can reduce transmission in public settings and workplaces, and offer a seamless and convenient experience. Touchless technology is expected to remain popular in various industries, such as retail, healthcare, and hospitality.

The web app highlights a special ATM which showcases an augmented transaction panel, enabling users to interact accurately through intuitive gestures detected from an input video feed. Users can perform essential operations directly through the interactive floating panel (on screen) via custom simple-to-use gestures, allowing them to experience the checkout process without the need for physical touch.
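As a rough illustration of how a fingertip position from the video feed could drive the floating panel, the sketch below maps a normalized point (x and y in 0..1) to screen pixels and checks which panel button, if any, lies under it. The `PanelButton` shape, the `toScreen` and `hitTest` helpers, and the mirrored-view default are assumptions made for this example; they are not taken from the demo's source.

```typescript
// A 2D point; normalized (0..1) before conversion, pixels after.
interface Point {
  x: number;
  y: number;
}

// Hypothetical description of one button on the floating panel, in pixels.
interface PanelButton {
  id: string;
  left: number;
  top: number;
  width: number;
  height: number;
}

// Convert a normalized point to pixel coordinates.
// Webcam previews are usually mirrored, so x is flipped by default (assumption).
function toScreen(
  p: Point,
  viewWidth: number,
  viewHeight: number,
  mirrored: boolean = true
): Point {
  const x = mirrored ? 1 - p.x : p.x;
  return { x: x * viewWidth, y: p.y * viewHeight };
}

// Return the id of the first button containing the cursor, or null if none does.
function hitTest(cursor: Point, buttons: PanelButton[]): string | null {
  for (const b of buttons) {
    const insideX = cursor.x >= b.left && cursor.x <= b.left + b.width;
    const insideY = cursor.y >= b.top && cursor.y <= b.top + b.height;
    if (insideX && insideY) {
      return b.id;
    }
  }
  return null;
}
```

Keeping the coordinate conversion separate from the hit test means the same button layout can be reused whether or not the preview is mirrored.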

Shortened Version
In a rapidly evolving technological landscape, the aftermath of the COVID-19 pandemic has amplified concerns regarding hygiene and touch-based interactions. With 80% of individuals deeming public touchscreens unhygienic, there is a compelling need for innovative solutions. Enter touchless gesture-based systems, poised to reshape industries and public spaces. Seamlessly aligning with the post-pandemic era, this technology offers intuitive and convenient interactions. From **ATMs** and airports to healthcare and retail, touchless interactions are on the brink of becoming ubiquitous. This project directly addresses these changing expectations by harnessing the power of the MediaPipe Hand Landmarker task from MediaPipe Solutions. By precisely detecting 21 key hand landmarks, this technology powers an interactive web application enabling users to effortlessly engage with interfaces through contactless gestures. Designed for optimal performance in well-lit environments and on larger screens, this project embodies the future of safer, more advanced interactions.

Project Report

Google’s MediaPipe Solutions helps developers add machine learning to their end-user devices, including mobile, web, and IoT. It provides a framework that lets you configure prebuilt processing pipelines that deliver immediate, engaging, and useful output to users.

The demo showcases the capabilities of the MediaPipe Hand Landmarker task, which accurately detects and tracks 21 hand landmarks. These landmarks are utilized in the web app to enable users to perform contactless interactions with the interface using simple gestures.
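To make the idea of building gestures from landmarks concrete, here is a minimal sketch of a pinch check over the 21-point hand model (index 0 is the wrist, 4 the thumb tip, 8 the index fingertip, 9 the middle-finger MCP joint). The `isPinching` and `detectPinches` helpers and the `0.25` threshold are hypothetical choices for illustration and are not the demo's actual gesture logic. The input is assumed to be one list of 21 normalized landmarks per detected hand, which is how hand-landmark results are typically structured.

```typescript
// A single normalized landmark (x and y are relative to the frame size).
interface Landmark {
  x: number;
  y: number;
  z: number;
}

// Indices in the 21-point MediaPipe hand model.
const WRIST = 0;
const THUMB_TIP = 4;
const INDEX_FINGER_TIP = 8;
const MIDDLE_FINGER_MCP = 9;
const LANDMARKS_PER_HAND = 21;

// Euclidean distance in the image plane (z is ignored in this sketch).
function distance2D(a: Landmark, b: Landmark): number {
  return Math.hypot(a.x - b.x, a.y - b.y);
}

// Hypothetical helper: true when the thumb tip and index fingertip are close
// together relative to the hand's size, so the check does not depend on how
// far the hand is from the camera. The threshold is an assumed value.
function isPinching(hand: Landmark[], threshold: number = 0.25): boolean {
  if (hand.length !== LANDMARKS_PER_HAND) {
    return false;
  }
  const handSize = distance2D(hand[WRIST], hand[MIDDLE_FINGER_MCP]);
  if (handSize === 0) {
    return false;
  }
  const pinchGap = distance2D(hand[THUMB_TIP], hand[INDEX_FINGER_TIP]);
  return pinchGap / handSize < threshold;
}

// Evaluate every detected hand independently, which covers two-hand input.
function detectPinches(hands: Landmark[][]): boolean[] {
  return hands.map((hand) => isPinching(hand));
}
```

Normalizing the fingertip gap by a hand-size measure is one common way to keep a distance threshold stable as the hand moves toward or away from the webcam.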


ⓘ Best experienced in well-lit environments and on larger screens. Inference runs entirely on the client side, and all data taken from the input video feed is deleted once inference returns, which keeps the app GDPR compliant.


🔸 Play with the Live Demo → Here

🔸 Alternate CodeSandbox Template → Here

🔸 View the Installation Notes → Here

🔸 Explore the Official Repository → Here


💡 The source code for the demo includes detailed comments that explain the implementation and rationale behind the design decisions.


Contributions

Throughout the summer, I made multiple contributions to MediaPipe. Note that many smaller git commits were rebased into each of the pull requests below:

  • [app]: MediaPipe Interactive Web Demo — Contactless ATM Playground, (PR #209)

    • Developed the interactive web demo utilizing MediaPipe's Tasks Vision API.
    • Created components, logic, and styles following React's modular architecture.
    • Implemented gesture recognition using hand landmarks for user interaction.
    • Engineered custom logic for detecting gestures without using the GestureRecognizer task.
    • Added support for both hands during landmark detection via the HandLandmarker task.
    • Wrote custom components and created different pages for the app.
    • Integrated React Redux for state management and UI updates.
    • Utilized React Toastify for displaying user notifications.
    • Handcrafted UI assets for the app using Figma, Adobe Photoshop, Illustrator & After Effects.
    • Fixed bugs, removed redundant code, and optimized imports for better performance.
    • Documented the app tutorial on my blog.
    • Deployed the web demo on Netlify and exported the same as a CodeSandbox template.

  • [feat]: Adding Offline Support for Interactive Web Demo, (PR #215)

    • Used Workbox to add offline support to the web demo through a service worker.
    • Implemented stale-while-revalidate logic to dynamically update cached content upon service worker activation.
    • Provided an option to unregister the service worker for troubleshooting or maintenance.
    • Fixed minor UI bugs and refactored code.
    • Updated the project's README file with comprehensive instructions and a usage guide.

  • [style]: Formatting & Asset Optimization, (WIP)
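The stale-while-revalidate idea mentioned under the offline-support work can be sketched without any service worker at all. Workbox ships this as a ready-made strategy (`StaleWhileRevalidate` in `workbox-strategies`); the in-memory class below is not Workbox code and not the demo's implementation, only an illustration of the behavior. The `StaleWhileRevalidateCache` class, its `settled` helper, and the `Fetcher` type are hypothetical names.

```typescript
// Stand-in for a network request; a real service worker would use fetch().
type Fetcher = (url: string) => Promise<string>;

class StaleWhileRevalidateCache {
  private readonly cache = new Map<string, string>();
  // Background refreshes in flight, kept so callers can wait for them.
  private readonly pending = new Map<string, Promise<void>>();
  private readonly fetcher: Fetcher;

  constructor(fetcher: Fetcher) {
    this.fetcher = fetcher;
  }

  async get(url: string): Promise<string> {
    const cached = this.cache.get(url);

    // Always start a refresh so the next read sees newer content.
    const refresh = this.fetcher(url).then((fresh) => {
      this.cache.set(url, fresh);
      return fresh;
    });

    if (cached !== undefined) {
      // Serve the stale copy immediately; a failed refresh is ignored
      // because the cached copy remains usable (for example, when offline).
      this.pending.set(
        url,
        refresh.then(
          () => undefined,
          () => undefined
        )
      );
      return cached;
    }

    // Nothing cached yet: the caller has to wait for the network.
    return refresh;
  }

  // Resolves once the most recent background refresh for `url` has finished.
  async settled(url: string): Promise<void> {
    await this.pending.get(url);
  }
}
```

The trade-off is that a reader may see content that is one request out of date, in exchange for responses that do not wait on the network once an entry is cached.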

Tutorial & Blogs

Over the course of my GSoC journey, I wrote several blog posts to share my insights. They are listed here in reverse chronological order:

| No. | Blog Title | Description | Link |
| --- | --- | --- | --- |
| 1 | Interactive Web Demo | Step-by-step guide to a touchless interactive web demo. | Link |
| 2 | Predicting Custom Gestures for Interactive Web Demo | Exploring how to predict custom gestures for interactive demos. | Link |
| 3 | A Holistic Preview of MediaPipe | A comprehensive look into MediaPipe's capabilities and potential. | Link |
| 4 | GSoC'23 Community Bonding Period | Insights into community bonding during GSoC 2023 preparations. | Link |

ⓘ This documentation is intended to assist other developers in utilizing the MediaPipe library and implementing similar touchless interaction features in their projects.


Architecture Overview 🔻

[Figure: application architecture diagram]

If you want to delve deep into the specs of the model, feel free to explore the official docs, which can be found here. You can access the official model card for MediaPipe Hands (Lite/Full) here. It provides detailed information about the model.


Interactive Web Demo 🔻

[Video: DEMOX.mp4]

⚠️ A functioning webcam is required for hand detection and gesture recognition. Please ensure your device has one.



License

Copyright 2023 The MediaPipe Authors. Distributed under the Apache License 2.0. See LICENSE for more information.


Summary

Participating in Google Summer of Code (GSoC) for the first time was a fantastic experience. I'm deeply grateful to my mentor, Jen Person 👩, for this opportunity. Her invaluable feedback propelled the project.

Special thanks to Paul Ruiz (@PaulTR) for his immense support and guidance throughout the program, and to Jason Mayes (@jasonmayes) for his valuable feedback on the proposal.

Beyond GSoC, I'm committed to ongoing contributions. Numerous exciting features remain to be explored. Count on me for consistent patches and updates to keep the project current. Feel free to connect on Twitter or LinkedIn for suggestions and feedback! 😄