## Using API classes to get data and save to db

# <span style="color:red">clear all output before saving: db output contains passwords! </span>
 
- imports the modules needed
- creates a temporary db (separate from the app db configured in .env)
- creates the API object for the chosen vendor/station type
- pulls data from the vendor API and saves it to the db


In [None]:
%load_ext autoreload
%autoreload 2

In [None]:

from ewxpwsdb.db.models import WeatherStation, APIResponse, Reading, StationType
from ewxpwsdb.db.importdata import import_station_file, read_station_table
station_file = '../data/test_stations.tsv'

### create new temp database to work with

This notebook does not use the .env file or the `get_db_url()` function for the app's db. 
Instead it creates a new temporary, empty PostgreSQL database, which is deleted at the end. 
This requires PostgreSQL to be installed and running on your computer. Any version should work; it was tested with versions 15 and 16.

Note that, as of now, the functions used assume the pg server can be accessed from localhost without a password. 

On macOS, https://postgresapp.com is the fastest way to get started and works in this way. 

In [None]:
# optional: temporary sqlite URL  - 
# note that sqlite does not support timezone aware datetime fields, so avoid using it
# 
# from tempfile import NamedTemporaryFile
# temp_sqlitefile = NamedTemporaryFile(suffix = '.db')
# temp_db_url = f"sqlite:///{temp_sqlitefile.name}"
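
The timezone caveat above can be seen with the standard library alone. This is a minimal sketch, assuming an in-memory SQLite database and a made-up table `t` (neither is part of `ewxpwsdb`). SQLite has no dedicated datetime storage type, so an aware timestamp survives only if the UTC offset is written into the stored text and parsed back by hand.

```python
import sqlite3
from datetime import datetime, timezone

aware = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (ts TEXT)")
# the value is stored as plain text; the offset survives only because
# isoformat() writes it into the string
con.execute("INSERT INTO t VALUES (?)", (aware.isoformat(),))
(stored,) = con.execute("SELECT ts FROM t").fetchone()
con.close()

# parsing the text back restores the offset
restored = datetime.fromisoformat(stored)

# once the offset is dropped, the value can no longer be ordered
# against aware datetimes
naive = restored.replace(tzinfo=None)
try:
    naive < aware
    mixed_comparison_ok = True
except TypeError:
    mixed_comparison_ok = False
```
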

In [None]:
# Postgresql temporary URL

# note, if you run this twice without running the associated drop function,
# there will be an orphaned empty temp database you may have to drop manually because the temp db name will be lost
# see the end of this notebook for that function
from ewxpwsdb.db.database import temp_pg_engine
engine = temp_pg_engine(host='localhost')

temp_db_url = engine.url
temp_db_url.database
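
One way to avoid the orphaned-database problem described in the comments above is to tie creation and removal together in a context manager. This is a minimal sketch of that pattern, using a temporary SQLite file as a stand-in. `temporary_database` is a hypothetical helper and not part of `ewxpwsdb`.

```python
import os
import sqlite3
import tempfile
from contextlib import contextmanager


@contextmanager
def temporary_database():
    """Hypothetical helper: yield a throwaway database and always delete it."""
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)  # sqlite3 opens the file itself
    con = sqlite3.connect(path)
    try:
        yield con, path
    finally:
        # runs even if the body raises, so no database is left behind
        con.close()
        os.remove(path)


with temporary_database() as (con, path):
    con.execute("CREATE TABLE station (id INTEGER PRIMARY KEY, code TEXT)")
    con.execute("INSERT INTO station (code) VALUES ('TEST1')")
    row_count = con.execute("SELECT COUNT(*) FROM station").fetchone()[0]
    existed_inside = os.path.exists(path)

exists_after = os.path.exists(path)
```

The same shape could apply to the PostgreSQL case, with the `finally` branch calling the drop function used at the end of this notebook.
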


In [None]:

# use the newly created temporary database

from ewxpwsdb.db.database import Session, init_db
from sqlmodel import select
init_db(engine, station_tsv_file=station_file)



## work with db

This works for just one station type.  Set the type you'd like to work with here.  See `src/ewxpwsdb/weather_apis/__init__.py` file for the list of currently supported station types.   

In [None]:
station_type = 'RAINWISE'

In [None]:
from ewxpwsdb.weather_apis import STATION_TYPE_LIST, API_CLASS_TYPES
APIclass = API_CLASS_TYPES[station_type]
print(APIclass._sampling_interval)
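
The lookup above follows a registry pattern: a dictionary maps each station type name to the class that handles it. This is a minimal sketch of the idea with made-up classes. The names `ExampleAPI`, `RainwiseLikeAPI`, `OtherVendorAPI`, `EXAMPLE_CLASS_TYPES` and the interval values are illustrative assumptions, not the package's actual code.

```python
class ExampleAPI:
    """Hypothetical base class for vendor API wrappers."""

    _sampling_interval = None  # minutes between readings, set by subclasses

    def __init__(self, station: dict):
        self.station = station


class RainwiseLikeAPI(ExampleAPI):
    _sampling_interval = 15  # assumed value, for illustration only


class OtherVendorAPI(ExampleAPI):
    _sampling_interval = 5  # assumed value, for illustration only


# the registry: station type name -> class that handles it
EXAMPLE_CLASS_TYPES = {
    "RAINWISE": RainwiseLikeAPI,
    "OTHER": OtherVendorAPI,
}

example_station = {"station_code": "EXAMPLE1", "station_type": "RAINWISE"}
api_class = EXAMPLE_CLASS_TYPES[example_station["station_type"]]
example_api = api_class(example_station)
```

An unsupported type name fails right at the dictionary lookup with a `KeyError`, before any API object is built.
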

The following function pulls one station of a particular type, the first one it finds

In [None]:
# convenience function: get the first station of a specific type from the database
def get_one_station(station_type, engine = engine):
    with Session(engine) as session:
        statement = select(WeatherStation).where(WeatherStation.station_type == station_type)
        results = session.exec(statement)
        # first() returns None when no station of this type exists
        weather_station = results.first()

    return weather_station

### create station object from database

In [None]:
station = get_one_station(station_type)
station

*A weather station object could be created without the database, but then it has no ID field and so can't be used to create related objects.*

In [None]:

# this is example code to pull a station from a file of stations, and not a database

# stations = read_station_table(station_file)
# # add code to find the correct one
# station_data = list(filter(lambda x: (x['station_type']==station_type), stations))[0]
# station = WeatherStation.model_validate(station_data) 
# station.station_type


## Test APIs


In [None]:
# create API class from station

from ewxpwsdb.weather_apis import API_CLASS_TYPES
wapi = API_CLASS_TYPES[station.station_type](station)
wapi.station_type

In [None]:
# demo how the weatherapi interface works

print(wapi.station_type)
print(wapi.weather_station.id)
print(wapi.sampling_interval)
print(wapi.APIConfigClass)
# check that configuration class is instantiated with same data in database
api_config = wapi.APIConfigClass.model_validate_json_str(wapi.weather_station.api_config)
print(api_config == wapi.api_config)

### API Requests

In [None]:
# setup the time interval

from datetime import timedelta
from ewxpwsdb.time_intervals import previous_fourteen_minute_interval
interval = previous_fourteen_minute_interval()

# widen the window: start 60 minutes before the interval start,
# and end 30 minutes before the interval end
s = interval.start  - timedelta(minutes = 60)
e = interval.end  - timedelta(minutes=30)

(s,e)
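
To make the window arithmetic concrete, here is a small sketch with a fixed, made-up interval. `ExampleInterval` is a hypothetical stand-in for whatever `previous_fourteen_minute_interval()` returns, assumed here to expose `start` and `end` datetimes 14 minutes apart.

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple


class ExampleInterval(NamedTuple):
    """Hypothetical stand-in for the interval object used in the notebook."""

    start: datetime
    end: datetime


example_end = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
example_interval = ExampleInterval(
    start=example_end - timedelta(minutes=14),
    end=example_end,
)

# same offsets as the notebook cell above
window_start = example_interval.start - timedelta(minutes=60)
window_end = example_interval.end - timedelta(minutes=30)
window_length = window_end - window_start
```

Under that assumption the requested window is 44 minutes long and ends 30 minutes before the interval end.
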


Get the data from an API request, and save that API_response into the database

In [None]:
# check that the locomos api is working
# this is a special LOCOMOS-only api
if station_type == 'LOCOMOS':
    print(wapi._get_variables())

In [None]:
# get a response and check it
api_response_records = wapi.get_readings(start_datetime=s, end_datetime=e)
from pprint import pprint
pprint(api_response_records[0].response_text)

In [None]:
# confirm there is some data in the response

for response_record in api_response_records:
    if wapi.data_present_in_response(response_record):
        print('data found')

import json
from pprint import pprint
response_data = json.loads(api_response_records[0].response_text)
for element in response_data: 
    print(element)
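
What counts as "data present" depends on the vendor's payload format. As a rough sketch of the simplest possible check, assuming only that the response text should be non-empty JSON (`example_data_present` is a hypothetical function, not the package's `data_present_in_response`):

```python
import json


def example_data_present(response_text: str) -> bool:
    """Hypothetical check: True when the text is valid, non-empty JSON."""
    try:
        payload = json.loads(response_text)
    except json.JSONDecodeError:
        # not JSON at all, e.g. an empty body or an HTML error page
        return False
    # empty lists, empty dicts and null all count as "no data"
    return bool(payload)


check_empty_list = example_data_present("[]")
check_bad_text = example_data_present("<html>error</html>")
check_records = example_data_present('{"records": [{"temp": 20.5}]}')
```
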

Save the API responses from the request in the database, which then assigns each one an ID number.

In [None]:
session = Session(engine)
for response in api_response_records:
    session.add(response)
    session.commit()

# session is still open

check that the current records that are inside the weatherapi were assigned a database ID

In [None]:

print(wapi.current_api_response_records[0].id)

### Readings

transform/harmonize the response data into sensor values.  

In [None]:
readings = wapi.transform(api_response_records)
readings

In [None]:
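for reading in readings:
    try:
        session.add(reading)
        session.commit()
    except Exception as err:
        # named 'err' so the end time 'e' set earlier is not clobbered
        session.rollback()  # reset the session so later commits can proceed
        print(err)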



In [None]:
print(readings[0].id)

### Reading Insert procedure

What happens when we pull data that we've already inserted? Test the **upsert** method here.

In [None]:
## test the upsert here! 

# attempt to re-insert a reading, updating the existing row instead (an upsert)
# signature: pg_upsert_stmt(model, insert_data, update_data=None, index_elements=None)
repeat_readings = wapi.transform(api_response_records)
print(repeat_readings[0])
from ewxpwsdb.db.database import pg_upsert_stmt
stmt = pg_upsert_stmt(Reading, repeat_readings[0].model_dump())
print(stmt)
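
An upsert inserts a row unless it collides with an existing unique key, in which case it updates the existing row instead. This is a minimal sketch of the SQL involved, run against an in-memory SQLite database, which accepts a PostgreSQL-style `ON CONFLICT ... DO UPDATE` clause since SQLite 3.24. The table and column names are illustrative assumptions, and the statement produced by `pg_upsert_stmt` may differ.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    """
    CREATE TABLE example_reading (
        station_id INTEGER NOT NULL,
        data_datetime TEXT NOT NULL,
        atmp REAL,
        UNIQUE (station_id, data_datetime)
    )
    """
)

# on a key collision, overwrite atmp with the incoming ("excluded") value
UPSERT_SQL = """
    INSERT INTO example_reading (station_id, data_datetime, atmp)
    VALUES (?, ?, ?)
    ON CONFLICT (station_id, data_datetime)
    DO UPDATE SET atmp = excluded.atmp
"""

# first call inserts, second call hits the unique key and updates
con.execute(UPSERT_SQL, (1, "2024-05-01T12:00:00+00:00", 20.5))
con.execute(UPSERT_SQL, (1, "2024-05-01T12:00:00+00:00", 21.0))

rows = con.execute(
    "SELECT station_id, data_datetime, atmp FROM example_reading"
).fetchall()
con.close()
```
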


## could we make the above form look more professional?

Currently it takes two function calls to get a response and transform it into reading records. 

Could we do something that looks like this?

```readings = wapi.get_readings(s,e).transform()```

That looks nice, but requires changing the code to return `self` from `get_readings` 

*(which may be possible if the outputs from these functions are saved as state in the class)*

We could also just create a convenience function to accomplish the same task: 

```readings = wapi.get_and_transform_readings(s,e)  ```


But we also want to save the responses in the database and link each reading record to its response, for auditing and discovering mistakes. Neither form allows for that, 
because the readings don't have a response database ID to link to until the responses are saved in the DB.

The solution is to use the `Collector` class that has the convenience wrappers and combines API with database functions. 

see [`example using collector`](example_using_collector.ipynb) notebook
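
The chaining idea discussed above can be sketched with a toy class that stores the latest responses as state and returns `self`. Everything here (`ChainableAPI`, its fake responses, and the transform rule) is hypothetical and only illustrates the call shape. It is not how `ewxpwsdb` is implemented.

```python
class ChainableAPI:
    """Hypothetical API wrapper that keeps its latest responses as state."""

    def __init__(self):
        self.current_responses = []

    def get_readings(self, start, end):
        # a real implementation would call the vendor API here
        self.current_responses = [
            {"start": start, "end": end, "values": [1.0, 2.0]}
        ]
        return self  # returning self is what enables chaining

    def transform(self):
        # flatten the stored responses into a list of values
        return [
            value
            for response in self.current_responses
            for value in response["values"]
        ]

    def get_and_transform_readings(self, start, end):
        # convenience wrapper: one call instead of two
        return self.get_readings(start, end).transform()


chained = ChainableAPI().get_readings(0, 60).transform()
wrapped = ChainableAPI().get_and_transform_readings(0, 60)
```

Neither form saves the responses, so the readings still have no response ID to link to, which is the gap the `Collector` class fills.
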


Check that these data can be turned into Reading objects (data + metadata).

Save the rows of data from the sensor into the database using a Session.

In [None]:
# readings from transform...
for reading in readings:
    session.add(reading)    
session.commit()
# does it have an id now?
readings[0].id    

In [None]:
readings[0].id 

## Getting data back from the database

In [None]:
# check we still have a station id. 
station.id


In [None]:
session.close()

In [None]:
# summarize readings in the database
station_id = station.id
with Session(engine) as session:
    stmt = select(Reading, WeatherStation).join(WeatherStation).where(WeatherStation.id == station_id)
    reading_records = session.exec(stmt).all()
    # use a distinct loop variable so the result list is not overwritten
    for reading_record in reading_records:
        # each row holds both selected models; pick out the Reading
        reading = reading_record.Reading
        print(f"{reading.data_datetime}: air temp {reading.atmp}C")

    # let's save one for later
    reading = readings[0]
    # no explicit close needed: the 'with' block closes the session
    



In [None]:
# get more data
# note that the 'session' must stay open for the whole transaction of responses and readings
responses = wapi.get_readings()
session = Session(engine)
for response in responses:
    session.add(response)
    session.commit()

responses[0].id
    
responses[0]

In [None]:
responses[0].id

In [None]:
stmt = select(APIResponse).where(APIResponse.request_id == wapi.current_api_response_records[0].request_id)
result = session.exec(stmt)
some_apiresponse = result.first()
print(some_apiresponse.id)

In [None]:
api_response_records = api_response_records or wapi.current_api_response_records
api_response_records[0].id

## Clean up 

If using databases, remove test databases

In [None]:
# if sqlite
# str() makes the test work whether temp_db_url is a plain string or a URL object
if 'sqlite' in str(temp_db_url) and temp_sqlitefile:
    session.close()
    engine.dispose()
    temp_sqlitefile.close()

In [None]:
# if postgresql (the drop function detects if it's a postgres engine)


from ewxpwsdb.db.database import drop_temp_pg_engine, list_pg_databases, drop_temp_pg_db
from sqlalchemy.orm import close_all_sessions

print(f"attempting to drop db {engine.url.database}")
close_all_sessions()
engine.dispose()
result = drop_temp_pg_engine(engine)
print(result)

engine.dispose()
list_pg_databases(host='localhost')
