UNECE/ML_dataset

ML_dataset

Open datasets from the Machine Learning Project

Statistics Poland ECOICOP Dataset

  • Description: Excel workbook listing the names of 17,099 products that were web-scraped from three online shops and anonymized. Non-experts manually classified the products into European Classification of Individual Consumption according to Purpose (ECOICOP) categories. The original dataset is in Polish; translations into other languages are provided in separate sheets. The translations were produced with an online translator and were not reviewed for accuracy or appropriateness.
  • Source: Statistics Poland
  • Preview - Sheet "Polish" (header and first three rows)
produkt | kategoria
Hejki - Emotki lizaki ręcznie robione o smakach owocowych | Wyroby cukiernicze
100% Pur jus d orange sok pomarańczowy z miąższe... | Soki owocowe i warzywne
100% sukraloza bez cukru (substancje słodzące) | Sztuczne substytuty cukru
  • Preview - Sheet "English" (header and first three rows)
produkt | kategoria
Hejki - Emotes handmade lollipops with fruit flavors | Confectionery products
100% Pur jus d orange orange juice with pulp ... | Fruit and vegetable juices
100% sucralose without sugar (sweeteners) | Artificial sugar substitutes
  • Example of reading the dataset in Python
import pandas as pd
df = pd.read_excel('https://raw.githubusercontent.com/UNECE/ML_dataset/master/Stats%20Poland%20ECOICOP%20data.xlsx', sheet_name='Polish')

The data is available in Dutch, English, French, German, Italian, Polish or Spanish; set the sheet_name parameter to the language you want.
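A minimal sketch of how the language choice could be wrapped in a small helper that rejects sheet names not listed above. The names `ECOICOP_URL`, `LANGUAGES` and `load_ecoicop` are hypothetical and not part of this repository; the sheet names are assumed to match the language names exactly as listed in this README.

```python
ECOICOP_URL = (
    "https://raw.githubusercontent.com/UNECE/ML_dataset/master/"
    "Stats%20Poland%20ECOICOP%20data.xlsx"
)

# Sheet names as listed in this README (assumed to match the workbook exactly).
LANGUAGES = ("Dutch", "English", "French", "German", "Italian", "Polish", "Spanish")


def load_ecoicop(language="Polish", source=ECOICOP_URL):
    """Return the sheet for one language as a pandas DataFrame."""
    if language not in LANGUAGES:
        raise ValueError(
            f"unknown sheet {language!r}; expected one of {', '.join(LANGUAGES)}"
        )
    # Imported here so the name check above works even without pandas installed.
    import pandas as pd

    return pd.read_excel(source, sheet_name=language)
```

For example, `load_ecoicop("English")` would download the workbook and return the English sheet. Reading .xlsx files with pandas typically also needs the openpyxl package installed.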

Belgium VITO Energy Balance Dataset

  • Description: Excel sheet with quarterly electricity supply by source (e.g. combustible fuels, hydro, nuclear), economic indicators (e.g. GDP, GVA) and other variables (e.g. population, sunspots) from 2000 Q1 to 2019 Q1.
  • Source: Buelens, Bart, & Goyens, Anneleen. (2020). Energy Balance Flanders quarterly and monthly data and related auxiliary data (Version 1.0.0) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.3596695
  • Preview (header and first three rows)
Variable Full name 2000Q1 2000Q2 2000Q3 ... 2019Q1
EnrgCombustibleFuels + Combustible Fuels GWh 10108 7739 6984 ... 8681.063
EnrgNuclearNuclear + Nuclear GWh 11087 10770 11154 ... 9304.936
EnrgHydroHydro + Hydro GWh 446 357 397 ... 346.173
  • Example of reading the dataset in Python
import pandas as pd
df = pd.read_excel('https://zenodo.org/record/3596695/files/VITO_EnergyBalanceDataML.xlsx', sheet_name='quarterly_txt')
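The quarterly columns in the preview are labelled like 2000Q1. As a rough sketch, the helper below generates such labels for a range of quarters, which may be handy for selecting or checking columns. The function name `quarter_labels` is hypothetical, and it is an assumption that the sheet has one column for every quarter between 2000Q1 and 2019Q1, since the preview elides the middle columns.

```python
def quarter_labels(start="2000Q1", end="2019Q1"):
    """List quarter labels such as '2000Q1' from start to end, inclusive."""

    def to_index(label):
        # Map 'YYYYQn' to a running quarter count so ranges are easy to walk.
        year, quarter = label.split("Q")
        q = int(quarter)
        if not 1 <= q <= 4:
            raise ValueError(f"quarter must be 1-4, got {label!r}")
        return int(year) * 4 + (q - 1)

    first, last = to_index(start), to_index(end)
    return [f"{i // 4}Q{i % 4 + 1}" for i in range(first, last + 1)]
```

With the DataFrame loaded as above, and assuming the column labels are read as strings in this form, `df[quarter_labels("2010Q1", "2010Q4")]` would select the four quarters of 2010.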
