LlamaIndex is a data framework for your LLM applications
Updated Nov 8, 2024 - Python
Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects.
(https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Available both self-hosted and cloud-hosted.
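📙 Chinese Xinhua Dictionary database. Includes xiehouyu (two-part allegorical sayings), idioms (chengyu), words, and Chinese characters.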
CKAN is an open-source DMS (data management system) for powering data hubs and data portals. CKAN makes it easy to publish, share and use data. It powers catalog.data.gov, open.canada.ca/data, data.humdata.org among many other sites.
AKShare is an elegant and simple financial data interface library for Python, built for human beings! An open-source financial data interface library.
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...
A synthetic data generator for text recognition
🧙 Build, run, and manage data pipelines for integrating and transforming data.
A next-generation curated knowledge sharing platform for data scientists and other technical professions.
Extract data from a wide range of Internet sources into a pandas DataFrame.
A few projects related to data engineering, including data modeling, cloud infrastructure setup, data warehousing, and data lake development.
Superduper: build end-to-end AI applications and templates using your existing data infrastructure and tools of choice
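Data interfaces: Baidu, Google, Toutiao, and Weibo indices; macroeconomic data; interest rate data; currency exchange rates; "Qianlima" (fast-growing) and unicorn companies; Xinwen Lianbo news transcripts; film and TV box office data; university lists; epidemic data…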
Mimesis is a robust data generator for Python that can produce a wide range of fake data in multiple languages.
Pull current and historical baseball statistics using Python (Statcast, Baseball Reference, FanGraphs)