Stars
Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.
Scira (Formerly MiniPerplx) is a minimalistic AI-powered search engine that helps you find information on the internet. Powered by Vercel AI SDK! Search with models like Grok 2.0.
Netflix-level subtitle cutting, translation, alignment, and even dubbing - one-click fully automated AI video subtitle team
🤖 Open-source GenBI AI Agent that empowers data-driven teams to chat with their data to generate Text-to-SQL, charts, spreadsheets, reports, dashboards and BI. 📈📊📋🧑‍💻
🪄 Create rich visualizations with AI
Solve Visual Understanding with Reinforced VLMs
Finetune Llama 3.3, DeepSeek-R1, Gemma 3 & Reasoning LLMs 2x faster with 70% less memory! 🦥
A lightweight Python library for simulating Chinese handwriting
🤗 smolagents: a barebones library for agents that think in python code.
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
[CVPR 2025] EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.
Docker-Compose template for orchestrating a Flask app with a Celery queue using Redis
PoC using FastAPI and Celery for ML inference
This repository contains the example code for my blog article Using Celery with Flask.
Intelligent multilingual AI video dubbing/translation tool - Linly-Dubbing - "AI-empowered, language without borders"
[ACL 2024 Best Paper] Deciphering Oracle Bone Language with Diffusion Models
Open-source, accurate and easy-to-use video speech recognition & clipping tool, with LLM-based AI clipping integrated.
A library for named entity recognition developed by UF HOBI NLP lab featuring SOTA algorithms
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
GLM-4 series: Open Multilingual Multimodal Chat LMs
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
A generative speech model for daily dialogue.
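A fully automatic (audio) video translation project. It uses Whisper to recognize speech, a large AI model to translate the subtitles, and finally merges the subtitles with the video to produce the translated video.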
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding