In [2]:
import lumipy as lm

In [9]:
# Get your atlas here as before
atlas = lm.get_atlas(api_secrets_filename='secrets.json')

Getting atlas🌎
  • Querying data provider metadata...
  • Querying direct provider metadata...
  • Building atlas...
Done!
Contents: 
  • 273 data providers
  • 18 direct providers


# Query Scripting II: CSVs, Table Variables and Window Functions

Take the `data/aapl_tsla.csv` file from the repo and upload it to drive. Note down the file's full path in drive: we're going to use it as a data source in this notebook.

## CSV File in Drive
### Background

You can read files from drive and treat them like any other provider:
```
    csv = atlas.drive_csv(file='/path/in/drive/to/your.csv')
```

Once initialised you can chain on `.select()` etc. to build your query. 

### Exercise
Initialise a provider object for your `aapl_tsla.csv` file and select the first 10 rows from it.

In [10]:
csv = atlas.drive_csv(file='honeycomb/testing/aapl_tsla.csv')

In [11]:
csv.select('*').limit(10).go()

Unnamed: 0,Date,aapl,tsla
0,2011-03-17,10.233702,4.562
1,2011-03-18,10.112293,4.592
2,2011-03-21,10.376211,4.546
3,2011-03-22,10.434312,4.438
4,2011-03-23,10.372846,4.442
5,2011-03-24,10.549605,4.466
6,2011-03-25,10.750521,4.55
7,2011-03-28,10.716885,4.65
8,2011-03-29,10.732789,4.784
9,2011-03-30,10.661532,4.742


## Table Variables and Column Functions

### Background
The results of a subquery may be stored in a table variable and re-used later on in a Luminesce query. You can do this in lumipy by calling `.to_table_var()` in your query script. 

Functions can be applied to columns by calling the corresponding method on the column object. Some of these methods are grouped under accessor attributes, much like pandas' `df['col'].str` and `df['col'].dt`.

### Exercise

Create a table variable from data derived from the CSV. The table variable should contain log returns data for AAPL and TSLA in separate columns. Name these new columns `aapl_log_rets` and `tsla_log_rets`.

Log returns for AAPL would be calculated like this:
```
    (csv.aapl.finance.prices_to_returns() + 1).log()
```
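
To make the arithmetic behind that expression concrete, here is a minimal plain-Python sketch of the same calculation. It is not how lumipy or Luminesce computes it: the helper name `log_returns` is hypothetical, and it assumes the prices-to-returns step yields the simple return `curr / prev - 1`, with no value for the first row.

```python
import math

def log_returns(prices):
    """Log returns of a price series; the first row has no predecessor."""
    if not prices:
        return []
    out = [None]  # no return can be computed for the first price
    for prev, curr in zip(prices, prices[1:]):
        simple_return = curr / prev - 1           # assumed prices-to-returns step
        out.append(math.log(simple_return + 1))   # same as log(curr / prev)
    return out

# First three AAPL prices from the CSV preview above
aapl = [10.233702, 10.112293, 10.376211]
rets = log_returns(aapl)
```

Adding 1 before taking the log turns the simple return back into the price ratio, so each value is just `log(curr / prev)`.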

In [12]:
tv = csv.select(
    '*',
    aapl_log_rets=(csv.aapl.finance.prices_to_returns() + 1).log(),
    tsla_log_rets=(csv.tsla.finance.prices_to_returns() + 1).log(),    
).to_table_var()

In [13]:
tv.select('*').limit(10).go()

Unnamed: 0,Date,aapl,tsla,aapl_log_rets,tsla_log_rets
0,2011-03-17,10.233702,4.562,,
1,2011-03-18,10.112293,4.592,-0.011935,0.006555
2,2011-03-21,10.376211,4.546,0.025764,-0.010068
3,2011-03-22,10.434312,4.438,0.005584,-0.024044
4,2011-03-23,10.372846,4.442,-0.005908,0.000901
5,2011-03-24,10.549605,4.466,0.016897,0.005388
6,2011-03-25,10.750521,4.55,0.018866,0.018634
7,2011-03-28,10.716885,4.65,-0.003134,0.02174
8,2011-03-29,10.732789,4.784,0.001483,0.02841
9,2011-03-30,10.661532,4.742,-0.006661,-0.008818


## Windows and Window Functions

### Background

Windows allow us to create sliding frames and partitions of our data and apply functions to them. In this notebook we'll use a window to compute statistics in a 90-day sliding window. In lumipy these windows are created using a top-level function in the module: `lumipy.window`.

A sliding window with a fixed number of rows before / after is specified by:
```
    sliding = lm.window(lower=n_before, upper=n_after)
```

An expanding window in either direction is specified by setting one of `lower`/`upper` to `None`:
```
    # Expanding window from the start to current row
    expanding = lm.window(lower=None, upper=0)
```
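
As a rough illustration of which rows each kind of window covers, the sketch below slices a plain Python list the same way. The helper `window_rows` is hypothetical. It assumes the frame includes the current row and is truncated at the ends of the data, as a SQL `ROWS BETWEEN` frame would be; lumipy's exact boundary behaviour is not confirmed here.

```python
def window_rows(values, i, lower=None, upper=0):
    """Rows in the frame around index i.

    lower / upper count rows before / after the current row;
    None means unbounded in that direction.
    """
    start = 0 if lower is None else max(0, i - lower)
    stop = len(values) if upper is None else min(len(values), i + upper + 1)
    return values[start:stop]

data = [1, 2, 3, 4, 5, 6]

# Sliding: two rows before and one row after the row at index 3
sliding = window_rows(data, 3, lower=2, upper=1)

# Expanding: everything from the start up to the row at index 3
expanding = window_rows(data, 3, lower=None, upper=0)
```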

Functions are applied to windows in a similar way to columns:
```
    window.finance.max_drawdown(table.prices)
```
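
For intuition about what a windowed max drawdown measures, here is a minimal plain-Python sketch. It assumes drawdown means the largest fractional fall from a running peak inside the window, and that the window is the current row plus up to `lower` preceding rows. Both function names are hypothetical and this is not lumipy's implementation.

```python
def max_drawdown(prices):
    """Largest fractional decline from a running peak (expects a non-empty list)."""
    peak = prices[0]
    worst = 0.0
    for p in prices:
        peak = max(peak, p)                # highest price seen so far
        worst = max(worst, 1 - p / peak)   # fall from that peak, as a fraction
    return worst

def rolling_max_drawdown(prices, lower):
    """Max drawdown over the current row and up to `lower` rows before it."""
    return [
        max_drawdown(prices[max(0, i - lower): i + 1])
        for i in range(len(prices))
    ]

# First five TSLA prices from the CSV preview
tsla = [4.562, 4.592, 4.546, 4.438, 4.442]
dd = rolling_max_drawdown(tsla, lower=90)
```

With fewer than 90 rows of history the window simply covers whatever rows exist, so the earliest values are computed on very short frames.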

### Exercise

Create a window that covers the 90 days (rows) up to the current row and build columns that apply some statistics over this window. Add these columns in a `.select()` call on `tv` and send the query off with `.go()`.

In [16]:
w90 = lm.window(lower=90)

In [18]:
query = tv.select(
    '*',
    aapl_max_dd=w90.finance.max_drawdown(tv.aapl),
    tsla_max_dd=w90.finance.max_drawdown(tv.tsla),    
    alpha_90d=w90.linreg.alpha(tv.aapl, tv.tsla),
    beta_90d=w90.linreg.beta(tv.aapl, tv.tsla),    
)

In [19]:
query.go()

Unnamed: 0,Date,aapl,tsla,aapl_log_rets,tsla_log_rets,aapl_max_dd,tsla_max_dd,alpha_90d,beta_90d
0,2011-03-17,10.233702,4.562000,,,0.000000,0.000000,,
1,2011-03-18,10.112293,4.592000,-0.011935,0.006555,0.011864,0.000000,7.090763,-0.247101
2,2011-03-21,10.376211,4.546000,0.025764,-0.010068,0.011864,0.010017,6.333350,-0.172515
3,2011-03-22,10.434312,4.438000,0.005584,-0.024044,0.011864,0.033537,8.568366,-0.392051
4,2011-03-23,10.372846,4.442000,-0.005908,0.000901,0.011864,0.033537,9.156790,-0.450305
...,...,...,...,...,...,...,...,...,...
2770,2022-03-18,163.979996,905.390015,0.020703,0.038035,0.171409,0.363183,647.568308,1.953147
2771,2022-03-21,165.380005,921.159973,0.008501,0.017268,0.171409,0.363183,548.250854,2.530922
2772,2022-03-22,168.820007,993.979980,0.020587,0.076083,0.171409,0.363183,505.091067,2.784488
2773,2022-03-23,170.210007,999.109985,0.008200,0.005148,0.171409,0.363183,417.335822,3.300755
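
Judging by the first rows of the output above, `alpha_90d` and `beta_90d` look consistent with an ordinary least-squares fit of the second argument (`tsla`) on the first (`aapl`) over the trailing window. That reading is inferred from the numbers rather than stated by lumipy, and the helper below (`ols_alpha_beta`, a hypothetical name) is only a plain-Python sketch of such a fit.

```python
def ols_alpha_beta(x, y):
    """Intercept (alpha) and slope (beta) of the least-squares line y = alpha + beta * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta

# Rows 0-2 of the table: the window available at row 2
aapl = [10.233702, 10.112293, 10.376211]
tsla = [4.562, 4.592, 4.546]
alpha, beta = ols_alpha_beta(aapl, tsla)
```

Row 0 has no alpha or beta because a single point does not determine a line (`sxx` would be zero).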
