A curated collection of public datasets organized by category for data science and machine learning projects.
- iris.csv - Fisher's Iris dataset with measurements of iris flowers
- titanic.csv - Titanic passenger survival data
- wine-quality.csv - Wine quality measurements and ratings
- diabetes.csv - Diabetes patient data with health indicators
- currency-codes.csv - ISO currency codes and information
- s-and-p-500.csv - Historical S&P 500 index data
- stock-prices-apple.csv - Historical Apple stock prices
- airport-codes.csv - Global airport codes and locations
- country-codes.csv - ISO country codes and information
- us-cities.csv - US cities with population and coordinates
- taxi_zone_lookup.csv - NYC taxi zone reference data
- yellow_tripdata_2025-01.parquet - NYC yellow taxi trip data
- heart-disease.csv - Heart disease patient data for prediction
- breast-cancer.csv - Breast cancer diagnostic measurements
- online-retail.csv - Online retail transaction data
- customer-segmentation.csv - Customer demographic and behavior data
- student-performance.csv - Student academic performance and demographics
- nba-players.csv - NBA player statistics and salaries
- world-cup-2014.csv - 2014 FIFA World Cup team statistics
All datasets are provided in CSV format, except yellow_tripdata_2025-01.parquet, which is in Parquet format. Files can be downloaded directly or accessed via GitHub raw URLs.
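
Access via a raw URL can be sketched as follows, using only the Python standard library. The repository owner, repository name, and branch are placeholders (this README does not state them), and the helper names `raw_url`, `parse_csv`, and `fetch_csv` are hypothetical. The sketch also assumes the files are UTF-8 encoded and have a header row.

```python
import csv
import io
import urllib.request

# Placeholder values: replace with the actual repository coordinates.
OWNER = "example-user"  # hypothetical owner, not taken from this README
REPO = "datasets"       # hypothetical repository name
BRANCH = "main"         # assumed default branch


def raw_url(path, owner=OWNER, repo=REPO, branch=BRANCH):
    """Build a raw.githubusercontent.com URL for a file in the repository."""
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{branch}/{path}"


def parse_csv(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def fetch_csv(path):
    """Download a CSV file from the repository and parse it (needs network access)."""
    with urllib.request.urlopen(raw_url(path)) as response:
        # Assumes UTF-8; adjust if a file uses a different encoding.
        text = response.read().decode("utf-8")
    return parse_csv(text)
```

With the placeholders filled in, `fetch_csv("iris.csv")` would return one dict per row, with every value as a string. If the files live in category subfolders, the path needs to include the folder. The Parquet file cannot be read this way; one option is `pandas.read_parquet`, which requires `pyarrow` or `fastparquet` to be installed.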
Contributions of additional public datasets are welcome. Please ensure datasets are properly formatted and include appropriate documentation.