# The Database Interface Classes

For each API endpoint exposed in the Django app, there is a corresponding class that
provides methods to execute CRUD operations asynchronously.

There are two types of API endpoints -- those that contain only 'records' data, and 
those that store pointers to files.

In [23]:
from yeastdnnexplorer.interface import *

## Records Only Endpoints

The records only endpoints are:

- BindingManualQC

- DataSource

- ExpressionManualQC

- FileFormat

- GenomicFeature

- PromoterSetSig

- Regulator

When the `read()` method is called on the corresponding API classes, a dataframe will
be returned in the response.

All of the `read()` methods, for both types of API endpoints, return the result of
a callable. By default, the callable returns a dictionary with two keys: `metadata` and
`data`. For records only endpoints, the `metadata` value will be the records from the
database as a pandas dataframe and the `data` value will be `None`.
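To make the callable pattern concrete, here is a minimal sketch. It does not use the real API classes: `FakeRecordsOnlyAPI`, `default_callback`, `count_records`, and the `callback` argument name are all hypothetical, and a plain list of dictionaries stands in for the pandas dataframe. It only illustrates the idea that `read()` hands its results to a callable and returns whatever that callable returns.

```python
import asyncio
from typing import Any, Callable, Dict, List, Optional

Record = Dict[str, Any]


def default_callback(metadata: List[Record], data: Optional[dict]) -> dict:
    """Bundle the two pieces into the default dictionary shape."""
    return {"metadata": metadata, "data": data}


class FakeRecordsOnlyAPI:
    """Hypothetical stand-in for a records only API class (no HTTP involved)."""

    def __init__(self, records: List[Record]) -> None:
        self._records = records

    async def read(self, callback: Callable[..., Any] = default_callback) -> Any:
        await asyncio.sleep(0)  # placeholder for the asynchronous request
        # Records only endpoints have no files, so `data` is None.
        return callback(self._records, None)


def count_records(metadata: List[Record], data: Optional[dict]) -> int:
    """An alternative callable that returns only the number of records."""
    return len(metadata)


async def main() -> tuple:
    api = FakeRecordsOnlyAPI(
        [
            {"id": 6, "regulator_locus_tag": "YAL059W", "regulator_symbol": "ECM1"},
            {"id": 7, "regulator_locus_tag": "YAL056W", "regulator_symbol": "GPB2"},
        ]
    )
    default_result = await api.read()
    n_records = await api.read(callback=count_records)
    return default_result, n_records


default_result, n_records = asyncio.run(main())
print(default_result["data"])
print(n_records)
```

In a notebook, where an event loop is already running, `await main()` would replace `asyncio.run(main())`.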

### Example -- RegulatorAPI

In [24]:
regulator = RegulatorAPI()

result = await regulator.read()
result.get("metadata")



Unnamed: 0,id,uploader_id,upload_date,modifier_id,modified_date,genomicfeature_id,under_development,notes,regulator_locus_tag,regulator_symbol
0,6,1,2024-03-18,1,2024-03-18 18:33:43.005782+00:00,17,False,none,YAL059W,ECM1
1,7,1,2024-03-18,1,2024-03-18 18:33:44.106488+00:00,20,False,none,YAL056W,GPB2
2,8,1,2024-03-18,1,2024-03-18 18:33:44.605193+00:00,21,False,none,YAL055W,PEX22
3,9,1,2024-03-18,1,2024-03-18 18:33:44.891194+00:00,22,False,none,YAL054C,ACS1
4,10,1,2024-03-18,1,2024-03-18 18:33:45.111148+00:00,23,False,none,YAL053W,FLC2
...,...,...,...,...,...,...,...,...,...,...
1810,1816,1,2024-03-18,1,2024-03-18 23:13:22.054254+00:00,5258,False,none,YMR168C,CEP3
1811,1817,1,2024-03-18,1,2024-03-18 23:41:58.734300+00:00,5878,False,none,YNR054C,ESF2
1812,1818,1,2024-03-25,1,2024-03-25 20:04:21.181017+00:00,569,False,none,YBR267W,REI1
1813,1819,1,2024-03-25,1,2024-03-25 20:04:32.036007+00:00,1171,False,none,YDR081C,PDC2


## Record and File Endpoints

The record and file endpoints are the following:

- CallingCardsBackground

- Expression

- PromoterSet

- PromoterSetSig

- RankResponse *

The default `read()` method behaves the same as it does for the records only endpoint
API classes. However, there is an additional argument, `retrieve_files`, which, if set
to `True`, will retrieve the file for which each record provides metadata. The return
value of `read()` is again the result of a callable, and by default the `data` key will
store a dictionary whose keys correspond to the `id` column in the `metadata`.

\* RankResponse is not yet implemented, but follows a similar pattern to the 'Record
and File' set.
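The relationship between `metadata` and `data` when files are retrieved can be sketched with plain Python objects. This is an illustration only: the `result` dictionary below is built by hand rather than returned by an API class, lists of dictionaries stand in for dataframes, the file contents are placeholders, and `files_by_record` is a hypothetical helper. The keys of `data` are assumed to be strings, as in a lookup such as `result.get("data").get("10690")`.

```python
from typing import Any, Dict

# Hand-built stand-in for the default result of read(retrieve_files=True).
result: Dict[str, Any] = {
    "metadata": [
        {"id": 10690, "file": "promotersetsig/10690.csv.gz"},
        {"id": 10694, "file": "promotersetsig/10694.csv.gz"},
    ],
    "data": {
        # Assumption: keys are the record ids as strings.
        "10690": [{"name": 1, "chr": "chrI", "start": 0, "end": 335}],
        "10694": [],  # placeholder; the real file contents are not shown here
    },
}


def files_by_record(result: Dict[str, Any]) -> Dict[str, Any]:
    """Map each record's file path to the retrieved contents for that record."""
    paired = {}
    for record in result["metadata"]:
        key = str(record["id"])  # metadata ids are integers, data keys are strings
        paired[record["file"]] = result["data"].get(key)
    return paired


paired = files_by_record(result)
print(paired["promotersetsig/10690.csv.gz"])
```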


In [25]:
# records only example
pss_api = PromoterSetSigAPI()
result = await pss_api.read()
result.get("metadata")



Unnamed: 0,id,uploader_id,upload_date,modifier_id,modified_date,binding_id,promoter_id,background_id,fileformat_id,file
0,8419,1,2024-03-25,1,2024-03-25 20:04:11.646870+00:00,3011,,,1,promotersetsig/8419.csv.gz
1,8420,1,2024-03-25,1,2024-03-25 20:04:12.153311+00:00,3012,,,1,promotersetsig/8420.csv.gz
2,8421,1,2024-03-25,1,2024-03-25 20:04:12.508328+00:00,3013,,,1,promotersetsig/8421.csv.gz
3,8422,1,2024-03-25,1,2024-03-25 20:04:12.835947+00:00,3014,,,1,promotersetsig/8422.csv.gz
4,8423,1,2024-03-25,1,2024-03-25 20:04:13.162373+00:00,3015,,,1,promotersetsig/8423.csv.gz
...,...,...,...,...,...,...,...,...,...,...
2178,11091,1,2024-03-26,1,2024-03-26 14:30:28.156704+00:00,4481,4.0,6.0,5,promotersetsig/11091.csv.gz
2179,11092,1,2024-03-26,1,2024-03-26 14:30:28.218468+00:00,4480,4.0,6.0,5,promotersetsig/11092.csv.gz
2180,11093,1,2024-03-26,1,2024-03-26 14:30:28.310173+00:00,4482,4.0,6.0,5,promotersetsig/11093.csv.gz
2181,11094,1,2024-03-26,1,2024-03-26 14:30:28.695849+00:00,4483,4.0,6.0,5,promotersetsig/11094.csv.gz


## Filtering

All API classes have a `params` attribute that stores the filtering parameters
applied to the HTTP requests.

In [26]:
pss_api.push_params({"regulator_symbol": "HAP5",
                     "workflow": "nf_core_callingcards_dev",
                     "data_usable": "pass"})

result = await pss_api.read(retrieve_files = True)
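As a rough illustration of what applying filtering parameters to an HTTP request can look like, the sketch below encodes the same filters as a URL query string with the standard library. The base URL is hypothetical, and the real API classes may build their requests differently.

```python
from urllib.parse import urlencode

# Hypothetical endpoint URL, used only for illustration.
BASE_URL = "https://example.org/api/promotersetsig/"

params = {
    "regulator_symbol": "HAP5",
    "workflow": "nf_core_callingcards_dev",
    "data_usable": "pass",
}

# urlencode turns the dictionary into key=value pairs joined by "&".
query_url = f"{BASE_URL}?{urlencode(params)}"
print(query_url)
```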

In [27]:
result.get("metadata")

Unnamed: 0,id,uploader_id,upload_date,modifier_id,modified_date,binding_id,promoter_id,background_id,fileformat_id,file
0,10690,1,2024-03-26,1,2024-03-26 14:28:43.825628+00:00,4079,4,6,5,promotersetsig/10690.csv.gz
1,10694,1,2024-03-26,1,2024-03-26 14:28:44.739775+00:00,4083,4,6,5,promotersetsig/10694.csv.gz
2,10754,1,2024-03-26,1,2024-03-26 14:29:01.837335+00:00,4143,4,6,5,promotersetsig/10754.csv.gz
3,10929,1,2024-03-26,1,2024-03-26 14:29:45.379790+00:00,4318,4,6,5,promotersetsig/10929.csv.gz
4,10939,1,2024-03-26,1,2024-03-26 14:29:47.853980+00:00,4327,4,6,5,promotersetsig/10939.csv.gz


In [28]:
result.get("data").get("10690")

Unnamed: 0,name,chr,start,end,strand,experiment_hops,background_hops,background_total_hops,experiment_total_hops,callingcards_enrichment,poisson_pval,hypergeometric_pval
0,1,chrI,0,335,+,0,2,103922,8579,0.0,0.305876,1.0
1,2,chrI,0,538,+,0,4,103922,8579,0.0,0.411518,1.0
2,3,chrI,2169,2480,-,0,1,103922,8579,0.0,0.246143,1.0
3,4,chrI,2169,2480,+,0,1,103922,8579,0.0,0.246143,1.0
4,5,chrI,9017,9717,-,0,11,103922,8579,0.0,0.669806,1.0
...,...,...,...,...,...,...,...,...,...,...,...,...
6703,7118,chrM,66174,66874,-,0,0,103922,8579,0.0,0.181269,1.0
6704,7132,chrM,74513,75213,-,0,0,103922,8579,0.0,0.181269,1.0
6705,7133,chrM,75984,76684,-,0,0,103922,8579,0.0,0.181269,1.0
6706,7137,chrM,80022,80722,-,0,0,103922,8579,0.0,0.181269,1.0


Parameters can be removed one by one:

In [29]:
print(pss_api.params)

regulator_symbol: HAP5, workflow: nf_core_callingcards_dev, data_usable: pass


In [30]:
pss_api.pop_params('data_usable')

print(pss_api.params)

regulator_symbol: HAP5, workflow: nf_core_callingcards_dev


or cleared entirely:

In [31]:
pss_api.pop_params(None)

pss_api.params == {}

True
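The push and pop behaviour shown above can be mimicked with a small stand-in class. This is a sketch under assumptions, not the library's implementation: `FakeParamsHolder` is hypothetical, it stores the filters in a plain dictionary (the real `params` object prints differently), and silently ignoring an unknown key is a choice made here for simplicity.

```python
from typing import Dict, Optional


class FakeParamsHolder:
    """Hypothetical stand-in that mimics push_params() and pop_params()."""

    def __init__(self) -> None:
        self.params: Dict[str, str] = {}

    def push_params(self, new_params: Dict[str, str]) -> None:
        """Add filters, overwriting any existing filter with the same key."""
        self.params.update(new_params)

    def pop_params(self, key: Optional[str]) -> None:
        """Remove one filter by key, or clear all filters when key is None."""
        if key is None:
            self.params.clear()
        else:
            # Assumption: an unknown key is ignored rather than raising.
            self.params.pop(key, None)


holder = FakeParamsHolder()
holder.push_params({"regulator_symbol": "HAP5", "data_usable": "pass"})
holder.pop_params("data_usable")
after_single_pop = dict(holder.params)
holder.pop_params(None)
after_clear = dict(holder.params)
print(after_single_pop, after_clear)
```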

## Caveats

The abstract classes are tested, but I haven't yet tested all of the concrete
implementations. RankResponse has been implemented in R, but I haven't translated 
it to Python yet -- that is imminent.

The point of setting these up as async calls is to facilitate their use in a frontend
of some sort (currently, the plan is to use Shiny for Python).
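One practical benefit of async calls is that several reads can be in flight at once instead of running back to back. The sketch below shows this with `asyncio.gather`; `fake_read` is a hypothetical coroutine that sleeps in place of a real request, and the string it returns is a placeholder for the records.

```python
import asyncio


async def fake_read(endpoint: str, delay: float) -> dict:
    """Hypothetical stand-in for an API read(); sleeps instead of calling HTTP."""
    await asyncio.sleep(delay)
    return {"metadata": f"records from {endpoint}", "data": None}


async def main() -> list:
    # Both reads are scheduled together; gather returns results in call order,
    # regardless of which one finishes first.
    return await asyncio.gather(
        fake_read("Regulator", 0.02),
        fake_read("PromoterSetSig", 0.01),
    )


regulator_result, pss_result = asyncio.run(main())
print(regulator_result["metadata"])
print(pss_result["metadata"])
```

In a notebook, where an event loop is already running, `await main()` would replace `asyncio.run(main())`.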