Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
1 parent 51d0a70
commit f073d48
Showing 4 changed files with 28 additions and 35 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,59 +1,50 @@ | ||
# Eurostat | ||
|
||
The program `eurostat.py` is a simple interface to parse Eurostat data. | ||
The package is a simple interface for parsing data from Eurostat: | ||
|
||
## Executing the module | ||
* deaths counts | ||
* population sizes | ||
|
||
Parsing data from Eurostat to a file is as easy as | ||
To import and fetch data, simply write | ||
|
||
```bash | ||
python3 eurostat.py --output data.csv --start 2019-01-01 --verbose | ||
```python | ||
import eurostat_deaths | ||
``` | ||
|
||
It downloads the file from Eurostat and parses it according to the input to an output format. | ||
The function `deaths()` fetches death counts and the function `populations()` fetches population sizes. Use them as follows: | ||
|
||
``` | ||
sex,age,geo\time,2020W23,2020W22,2020W21, ... ,2019W03,2019W02,2019W01 | ||
F,OTAL,AT,,,, ... ,852,877,914 | ||
F,OTAL,AT1,,, ... ,364,361,387 | ||
... | ||
``` | ||
## Deaths | ||
|
||
All parameters of the command can be shown with | ||
```python | ||
from datetime import datetime | ||
import eurostat | ||
|
||
```bash | ||
python3 eurostat.py --help | ||
data = eurostat.deaths(start = datetime(2019,1,1)) | ||
``` | ||
|
||
``` | ||
usage: eurostat.py [-h] [-o OUTPUT] [-n CHUNKSIZE] [-s START] [-v] | ||
optional arguments: | ||
-h, --help show this help message and exit | ||
-o OUTPUT, --output OUTPUT | ||
Directs the output to a name of your choice. | ||
-n CHUNKSIZE, --chunksize CHUNKSIZE | ||
Number of lines in chunk (in thousands). | ||
-s START, --start START | ||
Start date. | ||
-v, --verbose Sets verbose log (logging level INFO). | ||
``` | ||
Parameter `start` sets the start of the data. The end is always `now()`. | ||
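How the package applies `start` internally is not shown in this diff. As a minimal sketch, assuming the week labels in the data look like `2020W23` and follow ISO 8601 week numbering, a label can be converted to the Monday of its week and compared with the start date. The helpers `week_label_to_date` and `weeks_since` are hypothetical names used only for illustration, and the sketch uses `date` rather than `datetime` for brevity.

```python
from datetime import date


def week_label_to_date(label: str) -> date:
    """Convert a label such as '2020W23' to the Monday of that ISO week.

    Assumption: labels use ISO 8601 week numbering; special labels
    (for example an "unknown week" marker) are not handled here.
    """
    year, week = label.split("W")
    return date.fromisocalendar(int(year), int(week), 1)


def weeks_since(labels, start: date):
    """Keep only the labels whose week begins on or after `start`."""
    return [label for label in labels if week_label_to_date(label) >= start]


labels = ["2020W23", "2019W02", "2018W52"]
recent = weeks_since(labels, date(2019, 1, 1))
print(recent)
```

Note that under this convention the week `2019W01` begins on 2018-12-31, so a strict comparison against `2019-01-01` would drop it. Whether the package keeps or drops such a boundary week is not stated in the source.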
|
||
## Importing | ||
The result is weekly death counts. Since the total size of the data frame is about 218 MB, the call takes more than 15 minutes, and memory usage is significant. | ||
|
||
The module can be imported as well. The following code uses the inner function `read_eurostat()` to load the data. The total size of the data frame is about 218 MB, so the call takes more than 15 minutes and uses a large amount of memory. | ||
In the future, the module will be reimplemented on top of a big-data framework such as PySpark. | ||
|
||
The module should not be used like this. An implementation using a big-data framework, e.g. PySpark, is recommended instead. | ||
The data can be written directly to a file. Pass the filename to the function with the parameter `output`. | ||
|
||
```python | ||
from datetime import datetime | ||
import eurostat | ||
|
||
data = eurostat.read_eurostat(output = None, start = datetime(2019,1,1)) | ||
data = eurostat.deaths(output = "file.csv", start = datetime(2019,1,1)) | ||
``` | ||
|
||
With `output = None`, the output is collected into a single data frame and returned. | ||
|
||
One additional setting is `chunksize`, which sets the size of the chunk that is processed at a time. The unit is thousands of rows. | ||
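The idea behind chunked processing can be illustrated with a short, standard-library sketch. It is not the package's implementation: `iter_chunks` is a hypothetical helper, and the only detail taken from the text is that `chunksize` is expressed in thousands of rows. For reference, `pandas.read_csv` accepts its own `chunksize` argument, counted in single rows, that yields the file piece by piece.

```python
from itertools import islice


def iter_chunks(rows, chunksize=1):
    """Yield lists of at most chunksize * 1000 rows from any iterable."""
    size = chunksize * 1000  # the unit of chunksize is thousands of rows
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# 2500 dummy rows stand in for lines of the downloaded file.
rows = (f"row {i}" for i in range(2500))
chunk_lengths = [len(chunk) for chunk in iter_chunks(rows, chunksize=1)]
print(chunk_lengths)
```

Only one chunk is held in memory at a time, which is why a smaller `chunksize` lowers peak memory use at the cost of more iterations.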
|
||
## Population | ||
|
||
**TODO** | ||
|
||
## Credits | ||
|
||
Author: [Martin Benes](https://www.github.com/martinbenes1996). |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
pandas | ||
requests |