# Update a reference atlas with new query data and share the updated atlas

In this notebook, we are going to demonstrate how to download a reference model, add new query data to the model,
and share the updated model as a new reference atlas.

In [3]:
import scarches as sca
import scanpy as sc
sc.settings.set_figure_params(dpi=100, frameon=False, facecolor='white')

Using TensorFlow backend.


`condition_key` is the name of the column in your `adata.obs` that stores the batch ID.

In [4]:
condition_key = "study"

### load query data

In [5]:
target_conditions = ["Pancreas SS2", "Pancreas CelSeq2"]

In [6]:
adata = sca.datasets.pancreas()
adata = adata[adata.obs[condition_key].isin(target_conditions)]
adata

View of AnnData object with n_obs × n_vars = 5387 × 1000 
    obs: 'batch', 'study', 'cell_type', 'size_factors'
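
The subsetting step above relies on a boolean mask built with `isin`. As a minimal sketch of what that mask looks like, the snippet below applies the same filter to a small pandas DataFrame standing in for `adata.obs`. The cell names and the two non-target study labels are made up for illustration.

```python
import pandas as pd

# Toy stand-in for `adata.obs`; cell names and the non-target
# study labels are illustrative only.
obs = pd.DataFrame(
    {"study": ["Pancreas SS2", "Pancreas inDrop", "Pancreas CelSeq2", "Pancreas CelSeq"]},
    index=["cell_0", "cell_1", "cell_2", "cell_3"],
)
condition_key = "study"
target_conditions = ["Pancreas SS2", "Pancreas CelSeq2"]

# True for every cell whose study is one of the target conditions.
mask = obs[condition_key].isin(target_conditions)

# Indexing with the mask keeps only the query cells.
query_obs = obs[mask]
print(query_obs.index.tolist())
```

Indexing an AnnData object with such a mask returns a view, which is why the output above starts with "View of AnnData object".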

### create scArches network from pre-trained model 

A few parameters are worth mentioning here:

- __path_or_link__: path to a downloaded zip file of a scArches model, or a direct download link.
- __prev_task_name__: if `path_or_link` is a link, `prev_task_name` is used as the name of the downloaded file.
- __model_path__: path where the file downloaded from the link and the newly trained scArches model are saved.
- __new_task__: name of the new task to be solved.
- __target_conditions__: list of target conditions (i.e. batch IDs) to append to scArches' `condition_encoder`. These are the batch IDs of your new query data.
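
Since `path_or_link` accepts either a local path or a URL, it can help to check which one you are passing. The helpers below are a hypothetical sketch using only the standard library. `looks_like_link` and `file_name_from_link` are not part of scArches, and the package's own handling may differ.

```python
import posixpath
from urllib.parse import urlparse

def looks_like_link(path_or_link):
    """Return True if the argument is an http(s) URL (hypothetical helper)."""
    return urlparse(path_or_link).scheme in ("http", "https")

def file_name_from_link(link):
    """Return the file name in the URL path, ignoring the query string."""
    return posixpath.basename(urlparse(link).path)

# The Zenodo link used in this notebook.
link = "https://zenodo.org/record/3930127/files/scNet-pancreas_inDropCelSeqFC1.zip?download=1"

print(looks_like_link(link))                       # True
print(looks_like_link("./models/scArches/a.zip"))  # False
print(file_name_from_link(link))                   # scNet-pancreas_inDropCelSeqFC1.zip
```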

In [6]:
link = "https://zenodo.org/record/3930127/files/scNet-pancreas_inDropCelSeqFC1.zip?download=1"

In [9]:
network = sca.create_scArches_from_pretrained_task(
    path_or_link=link,
    prev_task_name='pancreas-inDropCelSeqFC1',
    model_path="./models/scArches/pancreas/",
    new_task="pancreas-CelSeq2,SS2",
    target_conditions=target_conditions,
    version='scArches',
)

File already exists!
scArches's network has been successfully constructed!
scArches' network has been successfully compiled!
scArches' network has been successfully compiled!
cvae's weights has been successfully restored!
scArches's network has been successfully constructed!
scArches' network has been successfully compiled!
scArches' network has been successfully compiled!


### fine-tune pre-trained scArches 

You can train scArches with the `train` function, which takes the following parameters:

1. __adata__: annotated dataset used for training and evaluating scArches.
2. __condition_key__: name of the column in `adata.obs` which contains the condition of each sample.
3. __n_epochs__: number of epochs used to train scArches.
4. __batch_size__: number of samples in each mini-batch used to optimize scArches.
5. __save__: whether to save scArches' model and configs after the training phase.
6. __retrain__: if `False` and a pretrained scArches model exists in `model_path`, scArches' weights are restored from it. Otherwise, scArches is trained and validated on `adata`.
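
To get a feel for how `batch_size` and `n_epochs` interact, the arithmetic below estimates the number of mini-batches per epoch for the values used in this notebook (5387 cells, `batch_size=128`, `n_epochs=100`). It is a rough sketch that assumes every cell is used for training and that training runs for all epochs. In practice `train` also validates on part of `adata` and may stop early, so the real counts will be lower.

```python
import math

n_cells = 5387    # n_obs of the query AnnData in this notebook
batch_size = 128
n_epochs = 100

# Number of mini-batches needed to see every cell once.
batches_per_epoch = math.ceil(n_cells / batch_size)

# The final mini-batch holds whatever is left over.
last_batch_size = n_cells - (batches_per_epoch - 1) * batch_size

# Upper bound on gradient updates if all epochs are run.
max_updates = batches_per_epoch * n_epochs

print(batches_per_epoch)  # 43
print(last_batch_size)    # 11
print(max_updates)        # 4300
```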

In [10]:
network.train(adata,
              condition_key=condition_key,
              n_epochs=100,
              batch_size=128, 
              save=True, 
              retrain=True)

Instructions for updating:
Use tf.cast instead.
 |█████████-----------| 48.9%  - loss: 114.5221 - mmd_loss: 0.2376 - reconstruction_loss: 114.2845 - val_loss: 109.0702 - val_mmd_loss: 0.2938 - val_reconstruction_loss: 108.7764
scArches has been successfully saved in ./models/scNet/pancreas/pancreas-CelSeq2,SS2/.


### share your trained scArches with other researchers using [Zenodo](https://zenodo.org/)

You can get a TOKEN by signing up on the [**Zenodo**](https://zenodo.org/) website and creating one in the settings. Follow these steps to create a new TOKEN:

1. Sign in to or register on [__Zenodo__](https://zenodo.org/).
2. Go to the __Applications__ page.
3. Click on __New token__ in the __Personal access tokens__ panel.
4. Give it the `deposit:actions` and `deposit:write` scopes.

__NOTE__: Zenodo shows the created TOKEN only once, so store it somewhere safe. If you lose your TOKEN, you have to create a new one.

In [11]:
ACCESS_TOKEN = "YOUR_TOKEN"
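
Because the TOKEN grants write access to your Zenodo account, you may prefer not to hard-code it in a notebook you intend to share. One option, sketched below, is to read it from an environment variable. The variable name `ZENODO_ACCESS_TOKEN` and the helper `load_zenodo_token` are assumptions made for this example and are not defined by scArches or Zenodo.

```python
import os

def load_zenodo_token(env_var="ZENODO_ACCESS_TOKEN"):
    """Read the Zenodo token from an environment variable (hypothetical helper)."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        # Fail early with a clear message instead of sending an empty token.
        raise RuntimeError(
            f"Environment variable {env_var} is not set or is empty."
        )
    return token
```

With the variable set in your shell, `ACCESS_TOKEN = load_zenodo_token()` can replace the assignment above.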

### 1. Create a Deposition in your zenodo account

You can use the wrapper functions in the `zenodo` module of the scArches package to interact with your depositions and uploaded files in Zenodo. In Zenodo, a deposition is a cloud space for a publication, poster, etc., which can contain multiple files.

To create a deposition in Zenodo, you can call our `create_deposition` function with the following parameters:

-  __access_token__: Your access token.
-  __upload_type__: Type of the deposition; it has to be one of the types defined [here](https://developers.zenodo.org/#representation).
-  __title__: Title of the deposition.
-  __description__: Description of the deposition.
-  __creators__: List of creators of this deposition. Each item in the list has to be in the following form:

```
{
    "name": "LASTNAME, FIRSTNAME", (Has to be in this format)
    "affiliation": "AFFILIATION", (Optional)
    "orcid": "ORCID" (Optional, has to be a valid ORCID)
}
```
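
Before sending anything to Zenodo, it can be useful to check the `creators` entries locally. The sketch below checks only the shape of the `name` and `orcid` fields. `check_creator` is a hypothetical helper, not part of scArches, and it does not verify the ORCID checksum or that the ORCID is actually registered.

```python
import re

# "LASTNAME, FIRSTNAME": exactly one comma followed by a space.
NAME_RE = re.compile(r"^[^,]+, [^,]+$")
# ORCID iDs have four dash-separated groups of four characters;
# the final character may be a digit or X.
ORCID_RE = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")

def check_creator(creator):
    """Return a list of problems found in one creator entry (empty if none)."""
    problems = []
    if not NAME_RE.match(creator.get("name", "")):
        problems.append('name must look like "LASTNAME, FIRSTNAME"')
    orcid = creator.get("orcid")
    if orcid is not None and not ORCID_RE.match(orcid):
        problems.append("orcid must look like 0000-0000-0000-0000")
    return problems

print(check_creator({"name": "Naghipourfar, Mohsen", "affiliation": "SUT"}))  # []
print(check_creator({"name": "Mohsen Naghipourfar"}))
```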





In [12]:
deposition_id = sca.zenodo.create_deposition(ACCESS_TOKEN, 
                                             upload_type="other", 
                                             title='scArches-pancreasCelSeq2,SS2',
                                             description='pre-trained scArches on CelSeq2, SmartSeq2',                                            
                                             creators=[
                                                 {"name": "Naghipourfar, Mohsen", "affiliation": "SUT"},
                                             ],
                                             )

New Deposition has been successfully created!
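
For reference, the Zenodo REST API expects deposition metadata wrapped in a top-level `metadata` object. The sketch below only builds that JSON payload from the arguments used above, without contacting Zenodo. `build_deposition_metadata` is a hypothetical helper, and whether `create_deposition` assembles its request in exactly this way is an assumption.

```python
import json

def build_deposition_metadata(upload_type, title, description, creators):
    """Assemble a Zenodo-style metadata payload (hypothetical helper)."""
    return {
        "metadata": {
            "upload_type": upload_type,
            "title": title,
            "description": description,
            "creators": creators,
        }
    }

payload = build_deposition_metadata(
    upload_type="other",
    title="scArches-pancreasCelSeq2,SS2",
    description="pre-trained scArches on CelSeq2, SmartSeq2",
    creators=[{"name": "Naghipourfar, Mohsen", "affiliation": "SUT"}],
)

# Pretty-print the payload to inspect it before any upload.
print(json.dumps(payload, indent=2))
```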


### 2. upload scArches to your deposition

After creating a deposition, you can upload your pre-trained scArches model using the `upload_model` function in the `zenodo` module. This function accepts the following parameters:

- __model__: Instance of the scArches class that has been trained on your task.
- __deposition_id__: ID of the deposition you want to upload the model to.
- __access_token__: Your TOKEN.

The function returns the generated `download_link`, which you can use yourself and provide to other researchers.

In [13]:
download_link = sca.zenodo.upload_model(network, 
                                        deposition_id=deposition_id, 
                                        access_token=ACCESS_TOKEN)

Model has been successfully uploaded


In [14]:
download_link

'https://zenodo.org/record/3930132/files/scNet-pancreas-CelSeq2SS2.zip?download=1'

### 3. publish the created deposition

In [15]:
sca.zenodo.publish_deposition(deposition_id, ACCESS_TOKEN)

Deposition with id = 3930132 has been successfully published!


Congrats! Your model is ready to be downloaded by other researchers!