<center>
# How to use DataFrames and Series in the brightwind library
</center>

In [None]:
# This cell prints today's date for reference. This comment can be deleted.
import datetime
print('Last updated: {}'.format(datetime.date.today().strftime('%d %B, %Y')))

***
## Outline:

Pandas DataFrames (two-dimensional) and Series (one-dimensional) are used extensively in the brightwind library to store, transfer and display data. Both are tabular data structures and are explained in more detail here: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html

This tutorial gives a basic introduction to using these data structures and outlines how to:

1. Separate specific columns from DataFrames into new DataFrames and Series
1. Select specific ranges from DataFrames and Series using the index
1. Search for a specific entry in a DataFrame or Series

***

### Step 1: Separating Columns and Rows from DataFrames and Series
Data can be read into DataFrames easily from CSV and Excel files using the brightwind functions <em>load_csv</em> and <em>load_excel</em>. In the example below, data is read from the CSV file <em>demo_data</em> into the DataFrame <em>data</em>.

In [2]:
import brightwind as bw
data = bw.load_csv(r'C:\Users\lukec\demo_data.csv')
data


Once this data is loaded, individual columns and rows of the DataFrame can be isolated for use in other calculations. To isolate the first column, <em>Spd80mN</em>, from the DataFrame <em>data</em> into the Series <em>Wspd80mN</em>, the command is:


In [None]:
Wspd80mN = data['Spd80mN']

The Series <em>Wspd80mN</em> can then be passed directly into a function such as <em>monthly_means</em>:

In [None]:
bw.monthly_means(Wspd80mN)
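
The outline also mentions separating columns into new DataFrames. As a minimal sketch in plain pandas, using a small made-up DataFrame in place of the demo data (the second column name <em>Spd80mS</em> and all values are assumptions for illustration), single brackets return a Series while a list of column names returns a DataFrame:

```python
import pandas as pd

# Hypothetical stand-in for the demo data: two wind speed columns on a 10-minute index.
index = pd.date_range('2016-01-09 17:00:00', periods=6, freq='10min')
demo = pd.DataFrame(
    {'Spd80mN': [7.1, 7.4, 6.9, 8.2, 8.0, 7.7],
     'Spd80mS': [7.0, 7.5, 6.8, 8.1, 8.3, 7.6]},
    index=index,
)

spd_series = demo['Spd80mN']            # single column name -> Series
spd_frame = demo[['Spd80mN']]           # list with one name -> one-column DataFrame
both = demo[['Spd80mN', 'Spd80mS']]     # list of names -> DataFrame with those columns

print(type(spd_series).__name__, spd_series.shape)
print(type(spd_frame).__name__, spd_frame.shape)
print(type(both).__name__, both.shape)
```

The double-bracket form is useful when a function expects a DataFrame rather than a Series, or when several columns need to be kept together.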

### Step 2: Selecting Ranges
1. Ranges from within a Series or DataFrame can then be selected using the index. For example, to select all data points from the start of the Series <em>Wspd80mN</em> up to and including a certain date, e.g. 2016-12-31:

In [None]:
Wspd80mN2016 = Wspd80mN[:'2016-12-31']

2. To select all the data points from a specific date, e.g. 2017-01-01, to the end of the Series:

In [None]:
Wspd80mN_less_2016 = Wspd80mN['2017-01-01':]

3. To select between two dates, e.g. 2017-01-01 to 2017-12-31 (both dates are included):

In [None]:
Wspd80mN2017 = Wspd80mN['2017-01-01':'2017-12-31']

4. These operations can also be performed on the original DataFrame, by selecting the column first and then the date range:

In [None]:
Wspd80mN2017 = data['Spd80mN']['2017-01-01':'2017-12-31']

5. These ranges can also be used within functions, such as <em>monthly_means</em>:

In [None]:
bw.monthly_means(Wspd80mN2017)
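
The slicing patterns above can be tried without the demo file. The following is a sketch using a short, made-up daily Series that spans the 2016/2017 boundary (the values are invented for illustration). It also shows that label-based date slices include both end points, and that the same range can be taken from a DataFrame in one step with <em>.loc</em>:

```python
import pandas as pd

# Synthetic daily series spanning the 2016/2017 boundary (values are made up).
index = pd.date_range('2016-12-29', '2017-01-03', freq='D')
spd = pd.Series([5.0, 6.0, 7.0, 8.0, 9.0, 10.0], index=index, name='Spd80mN')

up_to_2016 = spd[:'2016-12-31']               # start of series up to and including 31 Dec 2016
from_2017 = spd['2017-01-01':]                # 1 Jan 2017 to the end of the series
jan_1_to_2 = spd['2017-01-01':'2017-01-02']   # both end dates are included

# Equivalent explicit form on a DataFrame: row labels first, then the column name.
frame = spd.to_frame()
same_range = frame.loc['2017-01-01':'2017-01-02', 'Spd80mN']

print(len(up_to_2016), len(from_2017), len(jan_1_to_2))
print(same_range.equals(jan_1_to_2))
```

Unlike ordinary Python list slicing, the end label of a date slice is part of the result, so adjacent ranges such as `[:'2016-12-31']` and `['2017-01-01':]` do not overlap or leave a gap.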

***

### Step 3: Selecting Specific Entries

Specific entries in DataFrames and Series can be accessed either by their index and column name, or by their position in the DataFrame or Series.

1. To select a specific entry by its column name and index, e.g. the entry in the column <em>Spd80mN</em> at timestamp <em>2016-01-09 17:00:00</em>, type:


In [None]:
data['Spd80mN']['2016-01-09 17:00:00']

2. To select a specific entry by its position in the DataFrame, e.g. the 3rd entry in the 1st column, use <em>.iloc</em>. In pandas, positional indexing starts at 0 for both the rows and columns of DataFrames and Series. The position of the 3rd entry in the 1st column is therefore [2,0]:

In [None]:
data.iloc[2,0]
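
To see how label-based and position-based access relate, here is a small sketch on a made-up DataFrame (the column <em>Spd80mS</em>, the timestamps and the values are assumptions, not the demo data). It uses the pandas <em>.loc</em> accessor for labels, <em>.iloc</em> for positions, and <em>get_loc</em> to translate a label into its zero-based position:

```python
import pandas as pd

# Made-up 10-minute data standing in for the demo file.
index = pd.date_range('2016-01-09 17:00:00', periods=4, freq='10min')
demo = pd.DataFrame(
    {'Spd80mN': [7.1, 7.4, 6.9, 8.2], 'Spd80mS': [7.0, 7.5, 6.8, 8.1]},
    index=index,
)

timestamp = pd.Timestamp('2016-01-09 17:20:00')

by_label = demo.loc[timestamp, 'Spd80mN']   # row label, then column label
by_position = demo.iloc[2, 0]               # 3rd row, 1st column (zero-based)

# Translate the labels into their zero-based positions.
row_pos = demo.index.get_loc(timestamp)
col_pos = demo.columns.get_loc('Spd80mN')

print(by_label, by_position, row_pos, col_pos)
```

Label-based access is usually the safer choice for time series, since it keeps pointing at the same timestamp even if rows are added or removed before it.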

***
This tutorial can be downloaded as a Jupyter Notebook from the following link:
<br>
https://github.com/brightwind-dev/brightwind/tree/master/docs/source/tutorials/how_to_use_dataframes_and_series.ipynb

***