# Pyaro basic example

* Install pyaro and check that the installed version is recent enough:

In [18]:
import pyaro
pyaro.__version__

'0.0.5'
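
Whether a version is "new enough" can also be checked in code rather than by eye. The following is a minimal sketch, assuming plain dotted numeric versions such as `'0.0.5'`. The threshold `MINIMUM` and the helpers `version_tuple` and `is_new_enough` are hypothetical names for illustration, not part of pyaro.

```python
def version_tuple(version: str) -> tuple:
    """Turn a plain dotted version such as '0.0.5' into (0, 0, 5)."""
    return tuple(int(part) for part in version.split("."))


# Hypothetical threshold, chosen only to match the version printed above.
MINIMUM = "0.0.5"


def is_new_enough(installed: str, minimum: str = MINIMUM) -> bool:
    """Compare versions numerically, component by component."""
    return version_tuple(installed) >= version_tuple(minimum)


print(is_new_enough("0.0.5"))
print(is_new_enough("0.0.4"))
# Numeric comparison: 10 > 5, although "0.0.10" < "0.0.5" as strings.
print(is_new_enough("0.0.10"))
```

In a session with pyaro installed, the call would be `is_new_enough(pyaro.__version__)`. Versions with suffixes such as `.dev0` are not handled by this sketch and would need a real version parser, for example the third-party `packaging` library.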

* List the installed engines. A basic installation provides only the `csv_timeseries` engine.
Install e.g. `https://github.com/metno/pyaro-readers` for many more engines.

In [19]:
pyaro.list_timeseries_engines()

{'csv_timeseries': <pyaro.csvreader.CSVTimeseriesReader.CSVTimeseriesEngine at 0x7ff77705f250>}

* Learn a bit about the engine.

In [20]:
pr_csv = pyaro.list_timeseries_engines()['csv_timeseries']
help(pr_csv)

Help on CSVTimeseriesEngine in module pyaro.csvreader.CSVTimeseriesReader object:

class CSVTimeseriesEngine(pyaro.timeseries.AutoFilterReaderEngine.AutoFilterEngine)
 |  Method resolution order:
 |      CSVTimeseriesEngine
 |      pyaro.timeseries.AutoFilterReaderEngine.AutoFilterEngine
 |      pyaro.timeseries.Engine.Engine
 |      abc.ABC
 |      builtins.object
 |  
 |  Methods defined here:
 |  
 |  description(self)
 |      Get a descriptive string about this pyaro implementation.
 |  
 |  open(self, filename, *args, **kwargs) -> pyaro.csvreader.CSVTimeseriesReader.CSVTimeseriesReader
 |      open-function of the timeseries, initializing the reader-object, i.e.
 |      equivalent to Reader(filename_or_object_or_url, *, filters)
 |      
 |      :return pyaro.timeseries.Reader
 |      :raises UnknownFilterException
 |  
 |  reader_class(self)
 |      return the class of the corresponding reader
 |      
 |      :return: the class returned from open
 |  
 |  url(self)
 |      Get a

* Check the description and the arguments needed to open a data source with this engine:

In [21]:
print(pr_csv.description())
print(pr_csv.args())

Simple reader of csv-files using python csv-reader
('filename', 'columns', 'variable_units', 'country_lookup', 'csvreader_kwargs', 'filters')


## Opening a datasource with an engine

Now open the timeseries `ts` from a table. In larger code you could do that in a `with` block,
but for simplicity, we don't do that here. `columns` maps the file's columns to the data fields,
counting from 0 for the first column, which contains the variable name in our example file.

The test file is read using the Python `csv` module. `csvreader_kwargs` configures that module;
here it sets the delimiter to a comma.
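
To see what a column mapping and a delimiter setting mean in isolation, here is a small sketch that uses only the standard-library `csv` module. It is not pyaro's implementation. The helper `read_rows`, the reduced `COLUMNS` mapping and the sample text are made up for illustration, and the rule that integer entries are column indices while string entries are constants is an assumption based on the example in this notebook.

```python
import csv
import io

# Hypothetical mapping in the style of `columns`:
# integers are column indices, strings are constants used for every row.
COLUMNS = {"variable": 0, "station": 1, "value": 2, "country": "NO"}


def read_rows(text, columns, **csvreader_kwargs):
    """Yield one dict per CSV row, resolving each entry of `columns`."""
    reader = csv.reader(io.StringIO(text), **csvreader_kwargs)
    for row in reader:
        yield {
            name: row[source] if isinstance(source, int) else source
            for name, source in columns.items()
        }


# Made-up sample data, not taken from the pyaro test file.
SAMPLE = "SOx;station1;44.4\nNOx;station2;12.1\n"

# The keyword arguments are forwarded unchanged to csv.reader.
rows = list(read_rows(SAMPLE, COLUMNS, delimiter=";"))
print(rows[0])
```

The `csv` module returns every field as a string, so a real reader still has to convert values, coordinates and times to their proper types.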

In [22]:
file = "../../tests/testdata/csvReader_testdata.csv"
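columns = {
    # integers: index of the column in the CSV file, counting from 0
    "variable": 0,
    "station": 1,
    "longitude": 2,
    "latitude": 3,
    "value": 4,
    "units": 5,
    "start_time": 6,
    "end_time": 7,
    # strings: not read from the file, used as a constant for every row
    "altitude": "0",
    "country": "NO",
    "standard_deviation": "NaN",
    "flag": "0",
}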
csvreader_kwargs = {"delimiter": ","}

ts = pyaro.open_timeseries('csv_timeseries',
                           filename=file,
                           columns=columns,
                           csvreader_kwargs=csvreader_kwargs,
                           filters=[])


`ts` is now the handle to the data-source.
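
In larger code, a `with` block closes the reader automatically once the data has been read. The following is a minimal sketch, assuming the reader returned by `pyaro.open_timeseries` can be used as a context manager, as the text above suggests. The helper name `list_variables` is hypothetical, and pyaro is imported inside the function only so that the definition itself does not require pyaro.

```python
def list_variables(filename, columns, csvreader_kwargs):
    """Return the variable names of a CSV file and close the reader afterwards."""
    import pyaro  # imported here so the sketch can be defined without pyaro

    with pyaro.open_timeseries(
        "csv_timeseries",
        filename=filename,
        columns=columns,
        csvreader_kwargs=csvreader_kwargs,
        filters=[],
    ) as ts:
        # Copy what is needed before the block ends and the reader is closed.
        return list(ts.variables())
```

With the `file`, `columns` and `csvreader_kwargs` defined above, `list_variables(file, columns, csvreader_kwargs)` should return the variable names of the test file.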

* Access metadata in the data source, such as the available variables and stations:

In [23]:
print(ts.variables())
print(ts.stations())

dict_keys(['SOx', 'NOx'])
{'station1': <pyaro.timeseries.Station.Station object at 0x7ff776cc9d20>, 'station2': <pyaro.timeseries.Station.Station object at 0x7ff776cca6e0>}


* The timeseries data must be accessed per variable and is returned for all
stations at once. The names of the data columns are listed by `keys()`, and each
column is also available as an attribute:

In [24]:
var = 'SOx'
ts_data = ts.data(var)
print(ts_data.keys())
ts_data.stations
ts_data.start_times
ts_data.end_times
ts_data.latitudes
ts_data.longitudes
ts_data.altitudes
ts_data.flags
ts_data.values


('values', 'stations', 'latitudes', 'longitudes', 'altitudes', 'start_times', 'end_times', 'flags', 'standard_deviations')


array([44.377964 , 73.23672  , 66.83997  , 75.973015 , 54.252964 ,
       95.51215  , 43.424374 , 14.8503275, 39.78734  , 84.14651  ,
        2.3796806, 56.030033 , 90.70785  , 53.49256  , 33.27008  ,
       19.200666 , 16.61291  , 95.239876 , 58.38857  , 25.010443 ,
       49.31731  , 95.74444  , 35.146294 , 31.468204 , 70.109985 ,
       46.82392  , 44.06993  , 15.679094 , 54.04226  , 42.6484   ,
       21.370073 , 37.34375  , 14.086469 , 31.23552  , 12.328813 ,
       85.39133  , 96.85262  , 68.06294  , 67.1648   , 27.18295  ,
       28.523333 ,  1.4397316, 74.56935  , 50.91362  , 34.764988 ,
        4.5323606, 29.767143 , 16.157143 , 61.595753 , 57.319874 ,
       63.740353 ,  4.939785 ,  5.5386314, 73.256615 , 18.165173 ,
       96.29508  , 20.86049  , 60.049885 , 36.644806 , 70.943375 ,
        9.295645 ,  1.7138128, 56.983192 , 89.55616  , 13.375153 ,
       49.939552 , 31.528936 , 78.00686  , 28.33076  , 16.8259   ,
       73.02892  , 96.075714 , 19.514969 , 68.14331  , 21.9664

## Conversion to pandas

For pandas users, the timeseries data can be converted to a dataframe:

In [25]:
pyaro.timeseries_data_to_pd(ts_data)

|     | values | stations | latitudes | longitudes | altitudes | start_times | end_times | flags | standard_deviations |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 44.377964 | station1 | 10.5 | 172.500000 | 0.0 | 1997-01-01 | 1997-01-02 | 0 | |
| 1 | 73.236717 | station1 | 10.5 | 172.500000 | 0.0 | 1997-01-02 | 1997-01-03 | 0 | |
| 2 | 66.839973 | station1 | 10.5 | 172.500000 | 0.0 | 1997-01-03 | 1997-01-04 | 0 | |
| 3 | 75.973015 | station1 | 10.5 | 172.500000 | 0.0 | 1997-01-04 | 1997-01-05 | 0 | |
| 4 | 54.252964 | station1 | 10.5 | 172.500000 | 0.0 | 1997-01-05 | 1997-01-06 | 0 | |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 99 | 85.183685 | station2 | 45.5 | -103.199997 | 0.0 | 1997-02-17 | 1997-02-18 | 0 | |
| 100 | 93.348305 | station2 | 45.5 | -103.199997 | 0.0 | 1997-02-18 | 1997-02-19 | 0 | |
| 101 | 97.579193 | station2 | 45.5 | -103.199997 | 0.0 | 1997-02-19 | 1997-02-20 | 0 | |
| 102 | 19.217777 | station2 | 45.5 | -103.199997 | 0.0 | 1997-02-20 | 1997-02-21 | 0 | |
