Create Your Digital Clone for the Metaverse
Meta-Identity is an innovative AI-powered platform that creates digital avatars that look, sound, and behave like you. This project won "Best AI/ML Hack" at HackUMass '22 and represents a breakthrough in digital identity creation for the metaverse.
- Digital Avatar Creation: Transform your photos into animated digital avatars
- Voice Cloning: Generate speech that sounds like your own voice
- Personality Cloning: Train AI on your chat data to replicate your personality
- Video Generation: Create talking head videos with lip-sync
- Multi-Platform Integration: WhatsApp and SMS capabilities via Twilio
- Cloud Deployment: Scalable Google Cloud Platform integration
- Best AI/ML Hack - HackUMass '22
- Featured on Devpost
- Python 3.8+
- Google Cloud Platform account
- Twilio account (for messaging features)
- Hugging Face API token
- Clone the repository:
    git clone https://github.com/yourusername/meta-identity.git
    cd meta-identity
- Install dependencies:
    pip install -r requirements.txt
- Set up environment variables. Create a Secrets.py file with your API credentials:
    API_TOKEN = "your_huggingface_token"
    TOON_KEY = "your_rapidapi_key"
    GCP_BUCKET_NAME = "your_gcp_bucket"
    GCP_CREDENTIALS = "path_to_gcp_credentials.json"
    STORAGE_LINK_HEADER = "https://storage.googleapis.com/"
- Run the application:
    streamlit run app/app.py
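Before launching the app, it can help to confirm that the placeholder values in Secrets.py have been replaced. The following is a minimal sketch and not part of the project: `missing_settings` and `public_url` are hypothetical helpers, and the idea that uploaded files are publicly readable at `STORAGE_LINK_HEADER` + bucket + object name is an assumption.

```python
from urllib.parse import quote

# Setting names taken from the Secrets.py template in the setup steps.
REQUIRED = ("API_TOKEN", "TOON_KEY", "GCP_BUCKET_NAME",
            "GCP_CREDENTIALS", "STORAGE_LINK_HEADER")


def missing_settings(settings):
    """Return the required names that are absent, empty, or still placeholders."""
    missing = []
    for name in REQUIRED:
        value = str(settings.get(name, "")).strip()
        # The template placeholders all start with "your_" or "path_to_".
        if not value or value.startswith(("your_", "path_to_")):
            missing.append(name)
    return missing


def public_url(link_header, bucket, object_name):
    """Build a download URL for an object (assumes the object is public)."""
    return f"{link_header.rstrip('/')}/{bucket}/{quote(object_name)}"
```

After `import Secrets`, passing `vars(Secrets)` as `settings` would check the real file; an empty list means every value has been filled in.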
Meta-Identity combines multiple cutting-edge AI technologies:
- Personality Cloning
  - Uses GPT-2 architecture for conversational AI
  - Trains on WhatsApp, Facebook, and Instagram chat data
  - Maintains conversation context and personality traits
- Voice Cloning
  - GAN-based voice spectral transfer learning
  - Text-to-speech with personalized voice characteristics
  - Gender-specific voice modulation
- Face Animation
  - StyleGAN and StackGAN for image generation
  - RNN LSTM for audio-visual correlation
  - Real-time lip-sync with speech
- Video Synthesis
  - Combines animated face with cloned voice
  - Generates MP4 output with synchronized audio
  - Cloud-optimized processing pipeline
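To make "maintains conversation context" concrete, here is a minimal sketch of a bounded chat history. It assumes a DialoGPT-style model, which expects dialogue turns separated by the GPT-2 end-of-text token. The `ChatContext` class and the five-turn default are illustrative assumptions, not the project's code.

```python
from collections import deque

EOS = "<|endoftext|>"  # GPT-2 / DialoGPT end-of-text token


class ChatContext:
    """Keep only the most recent turns so the prompt stays short."""

    def __init__(self, max_turns=5):
        # A deque with maxlen drops the oldest turn automatically.
        self.turns = deque(maxlen=max_turns)

    def add(self, text):
        self.turns.append(text.strip())

    def prompt(self):
        # Every turn is terminated by EOS, oldest turn first.
        return "".join(turn + EOS for turn in self.turns)
```

Bounding the history is a simple way to keep the prompt within the model's input limit, at the cost of forgetting older turns.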
- Frontend: Streamlit, React (TypeScript)
- Backend: Python, Flask
- AI/ML: TensorFlow, PyTorch, Hugging Face Transformers
- Cloud: Google Cloud Platform, Google Cloud Storage
- APIs: Twilio (SMS/WhatsApp), RapidAPI (3D Cartoon Face)
- Audio Processing: SpeechRecognition, pyttsx3, gTTS
meta-identity/
├── app/                        # Main application
│   ├── app.py                  # Streamlit main application
│   ├── MetaIdentity.py         # Core AI functionality
│   ├── gcp_helpers.py          # Google Cloud integration
│   ├── sr_audio_recorder.py    # Audio recording component
│   ├── faceanimator/           # Face animation library
│   │   ├── sda/                # Speech-driven animation
│   │   └── main.py             # Animation examples
│   ├── st_audiorec/            # Audio recorder frontend
│   └── assets/                 # Media files
├── Twilio Bot/                 # WhatsApp/SMS integration
│   └── app.py                  # Flask bot server
├── requirements.txt            # Python dependencies
└── README.md                   # This file
- Upload your photo for avatar creation
- Record voice samples for voice cloning
- Provide chat data for personality training
- Image Processing: Convert photo to cartoon-style avatar
- Voice Analysis: Extract voice characteristics and patterns
- Personality Training: Process chat data to understand communication style
- Generate personalized responses using trained models
- Convert text to speech with cloned voice
- Create animated video with lip-sync
- Interactive digital avatar
- Video files with synchronized audio
- WhatsApp/SMS integration for remote interaction
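The generation flow above (reply text, then speech, then animated video) can be sketched as a small pipeline. This is an illustration under stated assumptions rather than the project's actual structure: `AvatarPipeline` is a hypothetical name, and each stage is injected as a plain callable so the real models can be swapped in.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AvatarPipeline:
    """Chain the three generation stages; each stage is an injected callable."""
    reply: Callable[[str], str]               # personality model: message -> reply text
    speak: Callable[[str], bytes]             # voice model: reply text -> audio bytes
    animate: Callable[[bytes, bytes], bytes]  # avatar image + audio -> video bytes

    def respond(self, message: str, avatar_image: bytes) -> dict:
        text = self.reply(message)
        audio = self.speak(text)
        video = self.animate(avatar_image, audio)
        return {"text": text, "audio": audio, "video": video}
```

Keeping the stages as callables also makes the flow easy to exercise with stand-ins, for example `AvatarPipeline(reply=str.upper, speak=str.encode, animate=lambda img, audio: img + audio)`.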
The face animator supports multiple pre-trained models:
- grid - GRID dataset (default)
- timit - TCD-TIMIT dataset
- crema - CREMA-D dataset
- lrw - LRW dataset
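A small lookup can guard against typos in the model name before the animator is loaded. This is a minimal sketch: the four keys and the default come from the list above, while `resolve_model` is a hypothetical helper and not part of the face animation library.

```python
SUPPORTED_MODELS = {
    "grid": "GRID dataset",
    "timit": "TCD-TIMIT dataset",
    "crema": "CREMA-D dataset",
    "lrw": "LRW dataset",
}
DEFAULT_MODEL = "grid"


def resolve_model(name=None):
    """Return a supported model key, falling back to the default when none is given."""
    if name is None or not name.strip():
        return DEFAULT_MODEL
    key = name.strip().lower()
    if key not in SUPPORTED_MODELS:
        raise ValueError(
            f"Unknown model {name!r}; choose from {sorted(SUPPORTED_MODELS)}"
        )
    return key
```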
- Male voice (default)
- Female voice (for "HappyWoman" character)
- Adjustable speech rate and volume
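As a rough sketch of how these options might be applied with pyttsx3 (listed in the tech stack): rate is in words per minute and volume ranges from 0.0 to 1.0, as in pyttsx3's `setProperty`. The function names and the 50-400 rate clamp are assumptions made for illustration.

```python
def voice_settings(character=None, rate=150, volume=1.0):
    """Pick a voice gender for a character and clamp rate and volume."""
    # Per the options above, only the "HappyWoman" character uses the female voice.
    gender = "female" if character == "HappyWoman" else "male"
    return {
        "gender": gender,
        "rate": max(50, min(int(rate), 400)),         # words per minute (assumed bounds)
        "volume": max(0.0, min(float(volume), 1.0)),  # pyttsx3 expects 0.0 to 1.0
    }


def apply_settings(engine, settings):
    """Apply rate and volume to a pyttsx3-style engine exposing setProperty()."""
    engine.setProperty("rate", settings["rate"])
    engine.setProperty("volume", settings["volume"])
```

Choosing the actual male or female voice is left out because the installed voices differ by platform; pyttsx3 lists them through `engine.getProperty("voices")`.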
- Hugging Face - For DialoGPT conversation model
- RapidAPI - For 3D cartoon face generation
- Google Cloud - For file storage and processing
- Twilio - For WhatsApp and SMS capabilities
- Hugging Face: 1000 requests/month (free tier)
- RapidAPI: Varies by plan
- Google Cloud: Pay-per-use
- Twilio: Varies by plan
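To avoid running past a monthly cap such as the 1000 requests listed for Hugging Face, calls can be counted locally before they are sent. The sketch below is an assumption-laden illustration: `MonthlyQuota` is a hypothetical class, it counts in memory only, and it does not query the provider's real usage.

```python
from datetime import date


class MonthlyQuota:
    """Count requests per calendar month and refuse calls past the limit."""

    def __init__(self, limit=1000):
        self.limit = limit
        self.month = None
        self.used = 0

    def try_acquire(self, today=None):
        today = today or date.today()
        month = (today.year, today.month)
        if month != self.month:  # new month: reset the counter
            self.month, self.used = month, 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True
```

A caller would check `quota.try_acquire()` before each API request and fall back to a canned reply when it returns `False`.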
# Start the main application
streamlit run app/app.py
# Start the Twilio bot (separate terminal)
cd "Twilio Bot"
python app.py
- Deploy to Google Cloud Run
- Set up Google Cloud Storage bucket
- Configure environment variables
- Set up Twilio webhook endpoints
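A Twilio messaging webhook answers an incoming message with a TwiML (XML) document. As a rough sketch of what the bot's endpoint could return once a reply has been generated: `twiml_reply` is a hypothetical helper built on the standard library, and the actual bot may use the Twilio helper library instead.

```python
import xml.etree.ElementTree as ET


def twiml_reply(text):
    """Return a TwiML document that sends `text` back to the sender."""
    response = ET.Element("Response")
    # ElementTree escapes characters such as & and < in the message text.
    ET.SubElement(response, "Message").text = text
    body = ET.tostring(response, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>' + body
```

The endpoint would return this string with an XML content type so Twilio can parse it.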
We welcome contributions! Please see our contributing guidelines:
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
- Samanvya - AI/ML Engineer
- Rajat - Full Stack Developer
- AR/VR Integration: Move from 2D to 3D avatars
- Real-time Processing: Live avatar generation
- Enhanced Personality: More sophisticated personality modeling
- Multi-language Support: Support for multiple languages
- Mobile App: Native mobile application
- Audio recording may fail on some systems
- Large file uploads may timeout
- Voice cloning quality varies by input quality
For support and questions:
- Create an issue on GitHub
- Contact us via Devpost
- Hugging Face for the DialoGPT model
- Google Cloud Platform for infrastructure
- Twilio for messaging capabilities
- The open-source community for various libraries
Built with ❤️ for the future of digital identity
Meta-Identity - Where You Meet Your Digital Self
