# POST scoreset

## Create a scoreset in MaveDB via the API

Before we begin, you will need to put the following files in the directories that we have created for you. Those directories already contain test files. You can store these files wherever you like, but make sure you adjust the path to the test file directory when it is set later on.

abstract.md
count.csv
fasta_file.fasta
metadata.json
score_data.csv

Be sure to organize the content of these files before beginning.

To begin, import the modules below.

In [None]:
import attr
import os
from mavetools.client.client import Client
from mavetools.models.licence import Licence
from mavetools.models.scoreset import NewScoreSet, NewScoreSetRequest, ScoreSet
from mavetools.models.target import NewTarget, ReferenceGenome, ReferenceMap, SequenceOffset

Check whether an environment variable named MAVEDB_BASE_URL exists on your computer. If it exists, base_url will be set to the value of MAVEDB_BASE_URL. To create and set a new environment variable, follow these steps:

1. 
2.
.
.
. 

If the environment variable MAVEDB_BASE_URL does not exist, os.getenv returns an empty string instead, and base_url is set to that empty string. The client then falls back to its default, which points to localhost (http://127.0.0.1:8000/api/). This default behavior is what you want when working with a local instance of MaveDB (e.g., a development branch).

In [None]:
base_url = os.getenv('MAVEDB_BASE_URL', '')
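
The lookup above can be tried without touching your real environment. The following is a minimal sketch: `resolve_base_url` is a hypothetical helper (not part of mavetools) that takes a mapping in place of `os.environ`, and the URL is a placeholder rather than a real MaveDB address.

```python
import os


def resolve_base_url(environ=os.environ):
    """Hypothetical helper: return MAVEDB_BASE_URL if set, else an empty string."""
    # Same lookup as os.getenv('MAVEDB_BASE_URL', ''), but against any mapping.
    return environ.get("MAVEDB_BASE_URL", "")


# Variable present: its value is returned (placeholder URL for illustration only).
print(repr(resolve_base_url({"MAVEDB_BASE_URL": "https://example.org/api/"})))

# Variable absent: the empty-string default is returned.
print(repr(resolve_base_url({})))
```

Called with no argument, the helper reads the real process environment, which matches the behavior of the cell above.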

Set scoreset_urn to the urn of the scoreset, and set experiment_urn to the urn of the experiment to which the scoreset belongs.

In [None]:
scoreset_urn = 'urn:mavedb:00000001-a-1'
experiment_urn = 'urn:mavedb:00000001-a'
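
In the example values above, the scoreset urn is the experiment urn followed by `-1`. Assuming that pattern holds for your data (this is inferred from the example only and is not a documented MaveDB rule), a small hypothetical helper can derive one from the other and catch copy-and-paste mistakes:

```python
def experiment_urn_from_scoreset(scoreset_urn: str) -> str:
    """Hypothetical helper: drop the trailing '-<number>' from a scoreset urn.

    Assumes a scoreset urn is its experiment urn plus a numeric suffix,
    as in the example values; verify this against your own urns.
    """
    prefix, sep, suffix = scoreset_urn.rpartition("-")
    if not sep or not suffix.isdigit():
        raise ValueError(f"unexpected scoreset urn: {scoreset_urn!r}")
    return prefix


# The example pair from this notebook is consistent with the assumed pattern.
assert experiment_urn_from_scoreset("urn:mavedb:00000001-a-1") == "urn:mavedb:00000001-a"
```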

Next, you will need an auth_token to make POST requests to MaveDB. If you have one, substitute it in the example provided below. If you need one, please follow these instructions:

1.
2.
3.
.
.
.

The auth_token is generated and set in your user account page within MaveDB. Ensure that your token matches the format of the example token below.

In [None]:
# Generate a new auth_token in your profile and paste it here
auth_token = 'AseyaNLLhqv9jAm0joMkq2oqB0bw3GKxTclkT2NtG340RF6CfdM2UC3j8Fv4RpbQ'
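
If you would rather not keep a token in the notebook itself, one option is to read it from an environment variable, in the same way base_url is read. This is only a sketch: the variable name `MAVEDB_AUTH_TOKEN` and the helper `load_auth_token` are assumptions made for illustration and are not defined by mavetools.

```python
import os


def load_auth_token(environ=os.environ, name="MAVEDB_AUTH_TOKEN"):
    """Hypothetical helper: read the token from an environment variable."""
    # Strip whitespace that often sneaks in when a token is copied and pasted.
    token = environ.get(name, "").strip()
    if not token:
        raise RuntimeError(f"Set the {name} environment variable to your MaveDB token")
    return token


# Demonstration with a stand-in mapping; call load_auth_token() with no
# arguments to read the real environment instead.
print(load_auth_token({"MAVEDB_AUTH_TOKEN": "example-token"}))
```

Failing early with a clear message is easier to debug than a rejected POST request later on.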

Here you instantiate the Client object, which is the object that performs the POST request. As mentioned previously, if the environment variable MAVEDB_BASE_URL exists, the Client object is instantiated with that value. Otherwise, it is instantiated with the default value, which points to localhost (http://127.0.0.1:8000/api/).

In [None]:
client = Client(base_url, auth_token=auth_token) if base_url else Client(auth_token=auth_token)

test_file_dir is the path to the directory that contains the files needed for making a scoreset POST request. The required files are as follows:

1.
2.
.
.
.

We have an example directory in mavetools. This directory can exist anywhere on your computer, but you must set test_file_dir to its correct path.

In [None]:
# Change this dir string as needed. It's currently configured for running
# inside a Docker container that mounts the home directory as a volume.
test_file_dir = '/mavetools/tests/test_upload_scoreset/test_files'
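
Before building the request, it can help to confirm that the directory really contains the files. The sketch below uses the file names that appear in the NewScoreSet and NewTarget cells of this notebook; `missing_files` is a hypothetical helper, not a mavetools function, and the list should be adjusted if your files are named differently.

```python
from pathlib import Path

# File names as used in the NewScoreSet and NewTarget cells of this notebook.
REQUIRED_FILES = [
    "test_score_data.csv",
    "test_count.csv",
    "test_metadata.json",
    "test_fasta_file.fasta",
]


def missing_files(directory, names=REQUIRED_FILES):
    """Hypothetical helper: list the names that are not regular files in `directory`."""
    base = Path(directory)
    return [name for name in names if not (base / name).is_file()]


# An empty list means every expected file was found.
print(missing_files("/mavetools/tests/test_upload_scoreset/test_files"))
```

In the notebook you would pass test_file_dir instead of the literal path.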

Instantiate the NewScoreSet object and assign it to the new_scoreset variable. You must substitute the example attribute values with your own scoreset's values.

In [None]:
new_scoreset = NewScoreSet(
    title='test_title',
    short_description='test_short_description',
    abstract_text='test_abstract_text',

    experiment=experiment_urn,
    score_data=f"{test_file_dir}/test_score_data.csv",
    count_data=f"{test_file_dir}/test_count.csv",
    meta_data=f"{test_file_dir}/test_metadata.json",
    licence=Licence(short_name='CC BY 4.0'),

    sra_ids=['SRP109119'],
    pubmed_ids=['23035249'],
    doi_ids=['10.1038/s41467-019-11526-w'],
)

Instantiate the NewScoreSetRequest object and assign it to the new_scoreset_request variable. You must substitute the example attribute values with your own.

In [None]:
new_scoreset_request = NewScoreSetRequest(
    scoreset=new_scoreset,
    target=NewTarget(
        name='test_target_name',
        type='Protein coding',
        sequence_type='Infer',
        fasta_file=f"{test_file_dir}/test_fasta_file.fasta"
    ),
    uniprot=SequenceOffset(offset=1, identifier='P63165'),
    ensembl=SequenceOffset(offset=1, identifier='ENSG00000116030'),
    refseq=SequenceOffset(offset=1, identifier='NM_001005781.1'),
    reference_maps=[
        ReferenceMap(genome=ReferenceGenome(short_name='hg16'))
    ]
)

In [None]:
client.post_model_instance(new_scoreset_request)