Meta-Identity @ hackUMass 2022 | Winner: 'Best AI/ML Hack'

sacredvoid/meta-identity

Meta-Identity 🚀

Create Your Digital Clone for the Metaverse 🎭

Meta-Identity is an AI-powered platform that creates digital avatars that look, sound, and behave like you. The project won "Best AI/ML Hack" at HackUMass '22 and explores digital identity creation for the metaverse.

🌟 Features

  • 🎭 Digital Avatar Creation: Transform your photos into animated digital avatars
  • 🗣️ Voice Cloning: Generate speech that sounds like your own voice
  • 🧠 Personality Cloning: Train AI on your chat data to replicate your personality
  • 🎬 Video Generation: Create talking head videos with lip-sync
  • 📱 Multi-Platform Integration: WhatsApp and SMS capabilities via Twilio
  • ☁️ Cloud Deployment: Scalable Google Cloud Platform integration

🏆 Awards & Recognition

  • 🥇 Best AI/ML Hack - HackUMass '22
  • 📱 Featured on Devpost - View Project

🚀 Quick Start

Prerequisites

  • Python 3.8+
  • Google Cloud Platform account
  • Twilio account (for messaging features)
  • Hugging Face API token

Installation

  1. Clone the repository

    git clone https://github.com/sacredvoid/meta-identity.git
    cd meta-identity
  2. Install dependencies

    pip install -r requirements.txt
  3. Set up credentials. Create a Secrets.py file with your API credentials:

    API_TOKEN = "your_huggingface_token"
    TOON_KEY = "your_rapidapi_key"
    GCP_BUCKET_NAME = "your_gcp_bucket"
    GCP_CREDENTIALS = "path_to_gcp_credentials.json"
    STORAGE_LINK_HEADER = "https://storage.googleapis.com/"
  4. Run the application

    streamlit run app/app.py
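
As a rough illustration of how these settings fit together, the sketch below checks that no placeholder values remain and builds a public object URL from STORAGE_LINK_HEADER and GCP_BUCKET_NAME. The helper names `missing_settings` and `public_url` and the sample values are hypothetical and do not come from the repository; the dictionary simply mirrors the Secrets.py template above.

```python
from urllib.parse import quote

# Mirrors the Secrets.py template; the bucket and credentials values are made up.
SETTINGS = {
    "API_TOKEN": "your_huggingface_token",
    "TOON_KEY": "your_rapidapi_key",
    "GCP_BUCKET_NAME": "my-bucket",
    "GCP_CREDENTIALS": "creds.json",
    "STORAGE_LINK_HEADER": "https://storage.googleapis.com/",
}

def missing_settings(settings):
    """Return names whose value is empty or still a 'your_...' placeholder."""
    return sorted(k for k, v in settings.items() if not v or v.startswith("your_"))

def public_url(settings, object_name):
    """Build <header>/<bucket>/<object>, percent-encoding the object name."""
    header = settings["STORAGE_LINK_HEADER"].rstrip("/")
    return f"{header}/{settings['GCP_BUCKET_NAME']}/{quote(object_name)}"
```

Running `missing_settings` before launching the app is one way to catch credentials that were never filled in.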

🏗️ Architecture

Meta-Identity combines several AI models and services:

Core Components

  1. Personality Cloning 🧠

    • Uses DialoGPT, a GPT-2-based conversational model from Hugging Face
    • Trains on WhatsApp, Facebook, and Instagram chat data
    • Maintains conversation context and personality traits
  2. Voice Cloning 🗣️

    • GAN-based voice spectral transfer learning
    • Text-to-speech with personalized voice characteristics
    • Gender-specific voice modulation
  3. Face Animation 🎭

    • StyleGAN and StackGAN for image generation
    • Recurrent network (LSTM) for audio-visual correlation
    • Real-time lip-sync with speech
  4. Video Synthesis 🎬

    • Combines animated face with cloned voice
    • Generates MP4 output with synchronized audio
    • Cloud-optimized processing pipeline
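
The README does not say which tool combines the animated face with the cloned voice. As one possible approach, the sketch below assembles an ffmpeg command that muxes a silent video and a speech track into an MP4. It assumes ffmpeg is installed and on the PATH; `build_mux_command` and `mux` are hypothetical names, not functions from this repository.

```python
import subprocess

def build_mux_command(video_path, audio_path, output_path):
    """Return an ffmpeg argument list that muxes video and audio into one file."""
    return [
        "ffmpeg", "-y",      # overwrite the output file if it exists
        "-i", video_path,    # silent animated-face video
        "-i", audio_path,    # synthesized speech
        "-c:v", "copy",      # keep the video stream unchanged
        "-c:a", "aac",       # encode audio as AAC for the MP4 container
        "-shortest",         # stop at the end of the shorter input
        output_path,
    ]

def mux(video_path, audio_path, output_path):
    """Run the command; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_mux_command(video_path, audio_path, output_path), check=True)
```

Copying the video stream avoids a re-encode, which only works when the animator already produces an MP4-compatible codec.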

Technology Stack

  • Frontend: Streamlit, React (TypeScript)
  • Backend: Python, Flask
  • AI/ML: TensorFlow, PyTorch, Hugging Face Transformers
  • Cloud: Google Cloud Platform, Google Cloud Storage
  • APIs: Twilio (SMS/WhatsApp), RapidAPI (3D Cartoon Face)
  • Audio Processing: SpeechRecognition, pyttsx3, gTTS

📁 Project Structure

meta-identity/
├── app/                         # Main application
│   ├── app.py                   # Streamlit main application
│   ├── MetaIdentity.py          # Core AI functionality
│   ├── gcp_helpers.py           # Google Cloud integration
│   ├── sr_audio_recorder.py     # Audio recording component
│   ├── faceanimator/            # Face animation library
│   │   ├── sda/                 # Speech-driven animation
│   │   └── main.py              # Animation examples
│   ├── st_audiorec/             # Audio recorder frontend
│   └── assets/                  # Media files
├── Twilio Bot/                  # WhatsApp/SMS integration
│   └── app.py                   # Flask bot server
├── requirements.txt             # Python dependencies
└── README.md                    # This file

🎯 How It Works

1. Data Collection

  • Upload your photo for avatar creation
  • Record voice samples for voice cloning
  • Provide chat data for personality training

2. AI Processing

  • Image Processing: Convert photo to cartoon-style avatar
  • Voice Analysis: Extract voice characteristics and patterns
  • Personality Training: Process chat data to understand communication style
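
To make the personality-training step concrete, here is a minimal sketch of turning an exported chat into (prompt, reply) pairs for fine-tuning. It assumes a WhatsApp-style text export where each message starts with a header such as `12/31/21, 10:15 PM - Alice: ...`; real exports vary by locale and platform, and `parse_chat` and `reply_pairs` are hypothetical helpers rather than the project's actual preprocessing.

```python
import re

# Assumed header format: "<date>, <time> - <sender>: <message>"
LINE_RE = re.compile(
    r"^\d{1,2}/\d{1,2}/\d{2,4}, \d{1,2}:\d{2}(?:\s?[AP]M)? - ([^:]+): (.*)$"
)

def parse_chat(lines):
    """Return (sender, message) tuples; headerless lines continue the previous message."""
    messages = []
    for line in lines:
        line = line.rstrip("\n")
        match = LINE_RE.match(line)
        if match:
            messages.append([match.group(1), match.group(2)])
        elif messages:
            messages[-1][1] += "\n" + line
    return [tuple(m) for m in messages]

def reply_pairs(messages, me):
    """Pair each message from someone else with the reply `me` sent right after it."""
    pairs = []
    for (prev_sender, prev_text), (sender, text) in zip(messages, messages[1:]):
        if sender == me and prev_sender != me:
            pairs.append((prev_text, text))
    return pairs
```

Only direct replies are kept here; a fuller pipeline would likely carry several turns of context per example.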

3. Digital Clone Generation

  • Generate personalized responses using trained models
  • Convert text to speech with cloned voice
  • Create animated video with lip-sync

4. Output

  • Interactive digital avatar
  • Video files with synchronized audio
  • WhatsApp/SMS integration for remote interaction
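
The four stages above can be read as a simple chain: text reply, then speech, then lip-synced video. The sketch below shows that flow with stand-in callables in place of the real models. All names (`CloneReply`, `respond`, and the three stage parameters) are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class CloneReply:
    text: str
    audio_path: str
    video_path: str

def respond(message: str,
            generate_text: Callable[[str], str],
            synthesize_speech: Callable[[str], str],
            animate_face: Callable[[str], str]) -> CloneReply:
    """Chain the stages: text reply -> speech file -> lip-synced video file."""
    text = generate_text(message)
    audio_path = synthesize_speech(text)
    video_path = animate_face(audio_path)
    return CloneReply(text, audio_path, video_path)

# Stand-in stages so the flow can be exercised without any model.
reply = respond(
    "hello",
    generate_text=lambda m: f"echo: {m}",
    synthesize_speech=lambda t: "reply.wav",
    animate_face=lambda a: a.replace(".wav", ".mp4"),
)
```

Passing the stages in as callables keeps each model swappable and makes the flow easy to test in isolation.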

🔧 Configuration

Model Selection

The face animator supports multiple pre-trained models:

  • grid - GRID dataset (default)
  • timit - TCD-TIMIT dataset
  • crema - CREMA-D dataset
  • lrw - LRW dataset
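
A small sketch of how a model name might be validated before it is handed to the face animator, falling back to `grid` when nothing is specified. `resolve_model` is a hypothetical helper; how the repository actually passes the name to the animator is not shown in this README.

```python
SUPPORTED_MODELS = {
    "grid": "GRID dataset",
    "timit": "TCD-TIMIT dataset",
    "crema": "CREMA-D dataset",
    "lrw": "LRW dataset",
}
DEFAULT_MODEL = "grid"

def resolve_model(name=None):
    """Return a validated model key, using the default when no name is given."""
    if name is None:
        return DEFAULT_MODEL
    key = name.strip().lower()
    if key not in SUPPORTED_MODELS:
        raise ValueError(
            f"Unknown model {name!r}; choose from {sorted(SUPPORTED_MODELS)}"
        )
    return key
```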

Voice Settings

  • Male voice (default)
  • Female voice (for "HappyWoman" character)
  • Adjustable speech rate and volume
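
As an illustration of these settings, the sketch below maps a character name to voice options and applies them to a pyttsx3 engine through `setProperty`. It assumes that the installed voice at index 0 is male and index 1 is female, which is common on some systems but not guaranteed. `voice_settings`, `apply_settings`, and `speak` are hypothetical helpers, not the project's code.

```python
def voice_settings(character, rate=150, volume=1.0):
    """Choose a voice index for the character and clamp volume to 0.0-1.0."""
    voice_index = 1 if character == "HappyWoman" else 0  # assumed voice ordering
    volume = max(0.0, min(1.0, float(volume)))
    return {"voice_index": voice_index, "rate": int(rate), "volume": volume}

def apply_settings(engine, settings):
    """Apply rate, volume, and voice to a pyttsx3-style engine."""
    engine.setProperty("rate", settings["rate"])
    engine.setProperty("volume", settings["volume"])
    voices = engine.getProperty("voices")
    if settings["voice_index"] < len(voices):
        engine.setProperty("voice", voices[settings["voice_index"]].id)

def speak(text, character="default"):
    """Speak text aloud; pyttsx3 is imported lazily so the helpers work without it."""
    import pyttsx3
    engine = pyttsx3.init()
    apply_settings(engine, voice_settings(character))
    engine.say(text)
    engine.runAndWait()
```

If the machine has only one installed voice, `apply_settings` leaves the engine's default voice in place.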

🌐 API Integration

Required APIs

  1. Hugging Face - For DialoGPT conversation model
  2. RapidAPI - For 3D cartoon face generation
  3. Google Cloud - For file storage and processing
  4. Twilio - For WhatsApp and SMS capabilities

Rate Limits

  • Hugging Face: 1000 requests/month (free tier)
  • RapidAPI: Varies by plan
  • Google Cloud: Pay-per-use
  • Twilio: Varies by plan
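
Because every external service here is rate limited, calls can fail temporarily. One common way to cope is to retry with exponential backoff, sketched below. `call_with_backoff` is a hypothetical helper and the default delays are arbitrary; a real client would usually retry only on rate-limit or network errors rather than on every exception.

```python
import time

def call_with_backoff(fn, retries=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on failure wait base_delay * 2**attempt, then retry.

    The exception from the final attempt is re-raised to the caller.
    """
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter exists so the waiting can be replaced in tests.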

🚀 Deployment

Local Development

# Start the main application
streamlit run app/app.py

# Start the Twilio bot (separate terminal)
cd "Twilio Bot"
python app.py

Cloud Deployment

  1. Deploy to Google Cloud Run
  2. Set up Google Cloud Storage bucket
  3. Configure environment variables
  4. Set up Twilio webhook endpoints
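
For the webhook step, Twilio sends the incoming message as form fields (the text arrives in `Body`) and expects a TwiML document in response. The sketch below builds that response with the standard library only. The repository's Flask bot may well use Twilio's helper library instead; `twiml_reply` and `handle_incoming` are hypothetical names.

```python
import xml.etree.ElementTree as ET

def twiml_reply(text):
    """Return a TwiML document telling Twilio to send `text` back to the sender."""
    response = ET.Element("Response")
    ET.SubElement(response, "Message").text = text
    body = ET.tostring(response, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>' + body

def handle_incoming(form, generate_text):
    """Handle a webhook POST given its form fields and a reply generator."""
    incoming = form.get("Body", "").strip()
    if not incoming:
        return twiml_reply("Please send a text message.")
    return twiml_reply(generate_text(incoming))
```

In a Flask route, `form` would be `request.form` and the returned string would be sent with an XML content type.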

🤝 Contributing

We welcome contributions! Please see our contributing guidelines:

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

👥 Team

  • Samanvya - AI/ML Engineer
  • Rajat - Full Stack Developer

🔮 Future Roadmap

  • AR/VR Integration: Move from 2D to 3D avatars
  • Real-time Processing: Live avatar generation
  • Enhanced Personality: More sophisticated personality modeling
  • Multi-language Support: Support for multiple languages
  • Mobile App: Native mobile application

🐛 Known Issues

  • Audio recording may fail on some systems
  • Large file uploads may time out
  • Voice cloning quality varies by input quality

📞 Support

For support and questions:

  • Create an issue on GitHub
  • Contact us via Devpost

🙏 Acknowledgments

  • Hugging Face for the DialoGPT model
  • Google Cloud Platform for infrastructure
  • Twilio for messaging capabilities
  • The open-source community for various libraries

Built with ❤️ for the future of digital identity

Meta-Identity - Where You Meet Your Digital Self 🚀
