# Introduction

This notebook is a cleaned-up record of how I worked out how to submit seqspec files to the IGVF DACC.

- [Python environment](#Setup)
- [Seqspec template function](#Template)
- [Working out steps needed to create seqspec objects](#Working-out-steps-needed-to-create-seqspec-objects)
- [Exploring boto3](#Exploring-boto3)
- [Seqspec submission functions](#Seqspec-Submission-functions)
- [Testing submission functions](#Testing-submission-functions)
- [Create seqspec objects for remaining fastqs](#Create-seqspec-objects-for-remaining-fastqs)

# Setup

First we start with general imports.

In [1]:
import hashlib
import requests
from pathlib import Path
from io import StringIO, BytesIO
import sys
import json
from jsonschema import Draft4Validator
import pandas
import os
from urllib.parse import urlparse

import boto3
from botocore.exceptions import ClientError

I want to be able to use the seqspec validator while I am writing my seqspec file.

I have the repository checked out into ~/proj/seqspec. This block should either import it from there, or install it if someone else runs the notebook.

In [2]:
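# Defined up front because the schema path below is built from it
seqspec_root = Path("~/proj/seqspec").expanduser()
try:
    import seqspec
except ImportError:
    if seqspec_root.exists():
        # Use the local checkout, adding it to the path only once
        if str(seqspec_root) not in sys.path:
            sys.path.append(str(seqspec_root))
    else:
        !{sys.executable} -m pip install --user seqspec
    import seqspec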

Import pieces of seqspec that we need for this notebook.

In [3]:
from seqspec.Assay import Assay
from seqspec.Region import Region
from seqspec.Region import Onlist
from seqspec.utils import load_spec_stream
import yaml

I have my own API client for interacting with the IGVF database server (which is very much like the old ENCODE database server).

In [4]:
try:
    from encoded_client import encoded
except ImportError:
    encoded_root = Path("~/proj/encoded_client").expanduser()
    if encoded_root.exists():
        # Use the local checkout, adding it to the path only once
        if str(encoded_root) not in sys.path:
            sys.path.append(str(encoded_root))
    else:
        !{sys.executable} -m pip install --user encoded_client

    from encoded_client import encoded

In [5]:
server_name = "api.sandbox.igvf.org"
server = encoded.ENCODED(server_name)
igvf_validator = encoded.DCCValidator(server)

In [6]:
award = "/awards/HG012077/"
lab = "/labs/ali-mortazavi/"


In [7]:
def seqspec_validate(schema, spec):
    """Validate a yaml object against a json schema
    """
    validator = Draft4Validator(schema)

    for idx, error in enumerate(validator.iter_errors(spec), 1):
        print(f"[{idx}] {error.message}")

In [8]:
schema_path = seqspec_root / "seqspec" / "schema" / "seqspec.schema.json"

with open(schema_path, "rt") as instream:
    seqspec_schema = json.load(instream)

In [9]:
def load_spec(filename):
    with open(filename, "rt") as instream:
        data = yaml.load(instream, Loader=yaml.Loader)
        for r in data.assay_spec:
            r.set_parent_id(None)
    return data

In [10]:
generic_spec = load_spec(seqspec_root / "specs" / "parse-wt-v2" / "wt-mega-v2.yaml")

seqspec_validate(seqspec_schema, generic_spec.to_dict())

# Template

This function is a template that takes my generic Parse split-seq seqspec file and fills in the fastq accessions of the files it is attached to.

In [11]:
def generate_parse_wt_yaml(read1_accession, read2_accession):
    """Instantiate a generic seqspec template with specific accession ids
    """
    return f"""!Assay
seqspec_version: 0.0.0
assay: WT-Mega-v2
sequencer: Illumina
doi: ""
publication_date: ""
name: WT Mega v2
description: split-pool ligation-based transcriptome sequencing
modalities:
- rna
lib_struct: ""
assay_spec:
- !Region
  region_id: {read1_accession}.fastq.gz
  region_type: fastq
  name: Read 1 sequence Fastq
  sequence_type: random
  sequence: X
  min_len: 100
  max_len: 150
  onlist: null
  regions:
  - !Region
    region_id: cDNA
    region_type: cdna
    name: cDNA
    sequence_type: random
    sequence: X
    min_len: 1
    max_len: 150
    onlist: null
    regions: null
    parent_id: {read1_accession}.fastq.gz
- !Region
  region_id: {read2_accession}.fastq.gz
  region_type: fastq
  name: Read 2 umi and barcode FASTQ
  sequence_type: joined
  sequence: NNNNNNNNNNNNNNNNNNGTGGCCGATGTTTCGCATCGGCGTACGACTNNNNNNNNATCCACGTGCTTGAGACTGTGGNNNNNNNN
  min_len: 86
  max_len: 86
  onlist: null
  regions:
  - !Region
    region_id: umi
    region_type: umi
    name: umi
    sequence_type: random
    sequence: NNNNNNNNNN
    min_len: 10
    max_len: 10
    onlist: null
    regions: null
    parent_id: {read2_accession}.fastq.gz
  - !Region
    region_id: barcode-1
    region_type: barcode
    name: barcode-1
    sequence_type: onlist
    sequence: NNNNNNNN
    min_len: 8
    max_len: 8
    onlist: !Onlist
      filename: barcode-1_onlist_v2.txt
      md5: 5c3b70034e9cef5de735dc9d4f3fdbde
    regions: null
    parent_id: {read2_accession}.fastq.gz
  - !Region
    region_id: linker-1
    region_type: linker
    name: linker-1
    sequence_type: fixed
    sequence: GTGGCCGATGTTTCGCATCGGCGTACGACT
    min_len: 30
    max_len: 30
    onlist: null
    regions: null
    parent_id: {read2_accession}.fastq.gz
  - !Region
    region_id: barcode-2
    region_type: barcode
    name: barcode-2
    sequence_type: onlist
    sequence: NNNNNNNN
    min_len: 8
    max_len: 8
    onlist: !Onlist
      filename: barcode-23_onlist.txt
      md5: 1452e8ef104e6edf686fab8956172072
    regions: null
    parent_id: {read2_accession}.fastq.gz
  - !Region
    region_id: linker-3
    region_type: linker
    name: linker-3
    sequence_type: fixed
    sequence: ATCCACGTGCTTGAGACTGTGG
    min_len: 22
    max_len: 22
    onlist: null
    regions: null
    parent_id: {read2_accession}.fastq.gz
  - !Region
    region_id: barcode-3
    region_type: barcode
    name: barcode-3
    sequence_type: onlist
    sequence: NNNNNNNN
    min_len: 8
    max_len: 8
    onlist: !Onlist
      filename: barcode-23_onlist.txt
      md5: 1452e8ef104e6edf686fab8956172072
    regions: null
    parent_id: {read2_accession}.fastq.gz
"""
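
As a quick sanity check on the Read 2 layout in the template, the lengths of the sub-regions should add up to the 86 bases declared for the joined read. The sketch below is not part of the submission workflow: the region list is copied by hand from the template, and `read2_regions` and `total_length` are illustrative names introduced here.

```python
# (region_id, sequence) pairs for Read 2, in order, copied from the template.
read2_regions = [
    ("umi", "N" * 10),
    ("barcode-1", "N" * 8),
    ("linker-1", "GTGGCCGATGTTTCGCATCGGCGTACGACT"),
    ("barcode-2", "N" * 8),
    ("linker-3", "ATCCACGTGCTTGAGACTGTGG"),
    ("barcode-3", "N" * 8),
]


def total_length(regions):
    """Sum the sequence lengths of a list of (region_id, sequence) pairs."""
    return sum(len(sequence) for _, sequence in regions)


for region_id, sequence in read2_regions:
    print(region_id, len(sequence))

# The joined Read 2 region declares min_len == max_len == 86.
assert total_length(read2_regions) == 86
```

If a linker or barcode length is ever edited in the template, this check should fail before the file is submitted.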

Now make sure that the template validates correctly.

In [12]:
example_yaml = generate_parse_wt_yaml("TSTFI61612395", "TSTFI25832476")
example_spec = load_spec_stream(StringIO(example_yaml))
seqspec_validate(seqspec_schema, example_spec.to_dict())
print(example_yaml)

!Assay
seqspec_version: 0.0.0
assay: WT-Mega-v2
sequencer: Illumina
doi: ""
publication_date: ""
name: WT Mega v2
description: split-pool ligation-based transcriptome sequencing
modalities:
- rna
lib_struct: ""
assay_spec:
- !Region
  region_id: TSTFI61612395.fastq.gz
  region_type: fastq
  name: Read 1 sequence Fastq
  sequence_type: random
  sequence: X
  min_len: 100
  max_len: 150
  onlist: null
  regions:
  - !Region
    region_id: cDNA
    region_type: cdna
    name: cDNA
    sequence_type: random
    sequence: X
    min_len: 1
    max_len: 150
    onlist: null
    regions: null
    parent_id: TSTFI61612395.fastq.gz
- !Region
  region_id: TSTFI25832476.fastq.gz
  region_type: fastq
  name: Read 2 umi and barcode FASTQ
  sequence_type: joined
  sequence: NNNNNNNNNNNNNNNNNNGTGGCCGATGTTTCGCATCGGCGTACGACTNNNNNNNNATCCACGTGCTTGAGACTGTGGNNNNNNNN
  min_len: 86
  max_len: 86
  onlist: null
  regions:
  - !Region
    region_id: umi
    region_type: umi
    name: umi
    sequence_type: rand

I have submission spreadsheets very similar to the IGVF DACC Google Sheet, but stored on our group's server as .xlsx files. These Nextcloud URLs download the spreadsheet remotely.

In [13]:
submitted_book_names = {
    "igvftst": {
        "IGVF_b01": "https://woldlab.caltech.edu/nextcloud/index.php/s/5cJteSWgitN5BDM/download",
    }
}

sequence_file = pandas.read_excel(submitted_book_names["igvftst"]["IGVF_b01"], "sequence_file")

# Working out steps needed to create seqspec objects

Here I am testing a set of pandas filters to make sure that I can pair the already submitted fastqs correctly.

In [14]:
for i, read1 in sequence_file[sequence_file["illumina_read_type"] == "R1"].iterrows():
    file_set_filter = (sequence_file["file_set"] == read1.file_set)
    flowcell_filter = (sequence_file["flowcell_id"] == read1.flowcell_id)
    lane_filter = (sequence_file["lane:integer"] == read1["lane:integer"])
    read2_filter = (sequence_file["illumina_read_type"] == "R2")
    read2s = sequence_file[file_set_filter & flowcell_filter & lane_filter & read2_filter]
    assert read2s.shape[0] == 1
    read2 = read2s.loc[read2s.first_valid_index()]
    print(i, read1.file_set, read1.accession, read2.accession, read1.md5sum)


0 ali-mortazavi:B01_13A_illumina TSTFI61612395 TSTFI25832476 702ca9574354a99fd009dce261fd9ee7
2 ali-mortazavi:B01_13A_illumina TSTFI76281026 TSTFI85921201 aa7dc0237ddb4349805abfb2f8c73012
4 ali-mortazavi:B01_13B_illumina TSTFI02165763 TSTFI20418101 243fb7c21baeedd32453c2f6c395eb9c
6 ali-mortazavi:B01_13B_illumina TSTFI91347763 TSTFI55411036 19ff5f7886592febc9b50cdfd147573b
8 ali-mortazavi:B01_13C_illumina TSTFI29302178 TSTFI86035976 847d94a593c3afcf8c8a0e85f4e39719
10 ali-mortazavi:B01_13C_illumina TSTFI13094701 TSTFI31391738 54e7f6c10013765156641e9a1f088841
12 ali-mortazavi:B01_13D_illumina TSTFI93762898 TSTFI67627476 dc802213c02fe80066b5919e60dffb83
14 ali-mortazavi:B01_13D_illumina TSTFI85205262 TSTFI60184104 dacdbccc732eef94b81a4cd2bb52f8e6
16 ali-mortazavi:B01_13E_illumina TSTFI52650531 TSTFI99387349 88f14665143d0ad153dcd551e5494194
18 ali-mortazavi:B01_13E_illumina TSTFI54335009 TSTFI34448948 f5ecf5e40c04969726b358a8547d1b5e
20 ali-mortazavi:B01_13F_illumina TSTFI51842628 TSTFI26
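
The pairing rule used in the loop (same file set, same flowcell, same lane, one R1 and one R2) can also be written without pandas by grouping rows on that key. This is only a sketch of the idea: the rows and accessions are invented, and `pair_reads` and the `read_type` and `lane` keys are hypothetical names, not columns from the real spreadsheet.

```python
from collections import defaultdict

# Made-up rows standing in for the sequence_file sheet; the values are invented.
rows = [
    {"accession": "EXAMPLE-R1-A", "file_set": "lab:set_A", "flowcell_id": "FC1", "lane": 1, "read_type": "R1"},
    {"accession": "EXAMPLE-R2-A", "file_set": "lab:set_A", "flowcell_id": "FC1", "lane": 1, "read_type": "R2"},
    {"accession": "EXAMPLE-R1-B", "file_set": "lab:set_A", "flowcell_id": "FC1", "lane": 2, "read_type": "R1"},
    {"accession": "EXAMPLE-R2-B", "file_set": "lab:set_A", "flowcell_id": "FC1", "lane": 2, "read_type": "R2"},
]


def pair_reads(rows):
    """Return (file_set, read1_accession, read2_accession) for each lane."""
    lanes = defaultdict(dict)
    for row in rows:
        key = (row["file_set"], row["flowcell_id"], row["lane"])
        if row["read_type"] in lanes[key]:
            raise ValueError("duplicate {} for {}".format(row["read_type"], key))
        lanes[key][row["read_type"]] = row["accession"]

    pairs = []
    for key, reads in lanes.items():
        # Every lane must have exactly one R1 and one R2.
        if set(reads) != {"R1", "R2"}:
            raise ValueError("incomplete pair for {}".format(key))
        pairs.append((key[0], reads["R1"], reads["R2"]))
    return pairs


pairs = pair_reads(rows)
print(pairs)
```

Raising on a duplicate or missing read plays the same role as the `assert read2s.shape[0] == 1` check in the pandas version.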

Now that we know which sets of files are supposed to be grouped together with which dataset, we can test a single seqspec file.

In [15]:
read1 = "TSTFI61612395"
read2 = "TSTFI25832476"
file_set = "ali-mortazavi:B01_13A_illumina"

example_yaml = generate_parse_wt_yaml(read1, read2)
example_spec = load_spec_stream(StringIO(example_yaml))
seqspec_validate(seqspec_schema, example_spec.to_dict())

md5 = hashlib.md5(example_yaml.encode('utf-8'))

seqspec_file = {
    "award": award,
    "lab": lab,
    "md5sum": md5.hexdigest(),
    "file_format": "yaml",
    "file_set": file_set,
    "content_type": "seqspec"
}
igvf_validator.validate(seqspec_file, "configuration_file")

try:
    print(server.get_json("md5:{}".format(seqspec_file["md5sum"])))
except encoded.HTTPError as err:
    if err.response.status_code == 404:
        print("Should upload")
        # Post the metadata describing the seqspec file to the configuration_file collection.
        # post_json turns into a call roughly like
        # requests.post(
        #   "https://api.sandbox.igvf.org/configuration_file",
        #   auth=(token, key),
        #   headers={"content-type": "application/json", "accept": "application/json"},
        #   data=json.dumps(seqspec_file))
        print(server.post_json("configuration_file", seqspec_file))
    else:
        # Anything other than "not found" is unexpected, so don't hide it
        raise

{'lab': {'@id': '/labs/ali-mortazavi/', 'title': 'Ali Mortazavi, UCI'}, 'award': {'component': 'mapping', '@id': '/awards/HG012077/'}, 'md5sum': '60e97148d27579d6749cfba948b9982e', 'status': 'in progress', 'file_set': '/measurement-sets/TSTDS34582101/', 'accession': 'TSTFI70244117', 'file_format': 'yaml', 'content_type': 'seqspec', 'submitted_by': {'@id': '/users/ebd1025b-e3ab-4e51-bfcd-1415f278281a/', 'title': 'Diane Trout'}, 'upload_status': 'pending', 'schema_version': '1', 'creation_timestamp': '2023-08-28T17:47:33.494967+00:00', '@id': '/configuration-files/TSTFI70244117/', '@type': ['ConfigurationFile', 'File', 'Item'], 'uuid': '23c7e5ce-614c-4b05-91ea-1e92690e27c6', 'summary': 'TSTFI70244117', 'href': '/configuration-files/TSTFI70244117/@@download/TSTFI70244117.yaml', 's3_uri': 's3://igvf-files-staging/2023/08/28/23c7e5ce-614c-4b05-91ea-1e92690e27c6/TSTFI70244117.yaml', 'seqspec_of': ['/sequence-files/TSTFI25832476/', '/sequence-files/TSTFI61612395/'], '@context': '/terms/', 'ac
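
To make the commented-out `requests.post` call a little more concrete, here is a minimal sketch that builds an equivalent request with the standard library and never sends it. It assumes the portal accepts HTTP basic authentication with an access key pair, as the `auth=` argument in the comment suggests. `build_post_request`, `EXAMPLE_KEY` and `EXAMPLE_SECRET` are placeholders introduced here and are not part of encoded_client.

```python
import base64
import json
from urllib.request import Request


def build_post_request(server_name, collection, payload, key_id, key_secret):
    """Build, but do not send, a JSON POST with HTTP basic authentication."""
    token = base64.b64encode("{}:{}".format(key_id, key_secret).encode("utf-8"))
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json",
        "Authorization": "Basic " + token.decode("ascii"),
    }
    return Request(
        "https://{}/{}".format(server_name, collection),
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Placeholder metadata and credentials, not real values.
example_metadata = {"file_format": "yaml", "content_type": "seqspec"}
request = build_post_request(
    "api.sandbox.igvf.org", "configuration_file", example_metadata,
    "EXAMPLE_KEY", "EXAMPLE_SECRET",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

The body is serialized with `json.dumps` so that it matches the `application/json` content type.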

In [16]:
seqspec_file

{'award': '/awards/HG012077/',
 'lab': '/labs/ali-mortazavi/',
 'md5sum': '60e97148d27579d6749cfba948b9982e',
 'file_format': 'yaml',
 'file_set': 'ali-mortazavi:B01_13A_illumina',
 'content_type': 'seqspec'}

Make sure that the md5sum computed for the seqspec file is reproducible.

In [17]:
seqspec_file["md5sum"] == "60e97148d27579d6749cfba948b9982e"

True
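
The check works because the digest depends only on the bytes of the generated YAML. The sketch below illustrates that with a one-line stand-in for the template: the same accession always gives the same digest, and a different accession gives a different one, which is why each fastq pair ends up with its own configuration file. `md5_of_text` and `tiny_template` are illustrative helpers, not functions from this notebook.

```python
import hashlib


def md5_of_text(text):
    """Return the hex MD5 digest of a string encoded as UTF-8."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def tiny_template(accession):
    """Stand-in for the real template: any text that embeds the accession."""
    return "region_id: {}.fastq.gz\n".format(accession)


first = md5_of_text(tiny_template("TSTFI61612395"))
again = md5_of_text(tiny_template("TSTFI61612395"))
other = md5_of_text(tiny_template("TSTFI76281026"))

print(first == again)  # same text, so the digests match
print(first == other)  # different accession, so the digests differ
```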

# Exploring boto3

For convenience while I was figuring out how this worked, I stored the credentials extracted from the return value of server.post_json. I've sanitized out the random keys from what I actually used.

In [25]:
credentials = {'session_token': "redacted",
    'access_key': "redacted",
    'expiration': '2023-08-30T05:47:33+00:00',
    'secret_key': "redacted",
    'upload_url': 's3://igvf-files-staging/2023/08/28/23c7e5ce-614c-4b05-91ea-1e92690e27c6/TSTFI70244117.yaml',
    'federated_user_arn': 'arn:aws:sts::920073238245:federated-user/up1693244853.500758-TSTFI7024411',
    'federated_user_id': '920073238245:up1693244853.500758-TSTFI7024411',
    'request_id': '5e0de40c-13a2-4485-b52c-c71d1dc364a4'}


!rm onlist_joined.txt

assert False

response = requests.get("https://woldlab.caltech.edu/~diane/parse_barcodes/bc1_n192_v4.txt.gz")
response.headers

This is the most minimal example: using the credentials returned by the post, you can create the s3_client and then upload a string as a file object to S3.

In [38]:
s3_client = boto3.client(
    's3', 
    aws_access_key_id=credentials["access_key"], 
    aws_secret_access_key=credentials["secret_key"], 
    aws_session_token=credentials["session_token"])

s3_client.upload_fileobj(
    BytesIO(example_yaml.encode("utf-8")),
    "igvf-files-staging", 
    "2023/08/28/23c7e5ce-614c-4b05-91ea-1e92690e27c6/TSTFI70244117.yaml")
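
The saved credentials include an `expiration` timestamp, so an upload attempted much later will likely be rejected. A small check like the following could be run before creating the client. It is a sketch that assumes the timestamp stays in the ISO 8601 form shown in the saved credentials, and `credentials_expired` is a hypothetical helper.

```python
from datetime import datetime, timezone


def credentials_expired(credentials, now=None):
    """Return True if the credentials' expiration time has passed."""
    if now is None:
        now = datetime.now(timezone.utc)
    expiration = datetime.fromisoformat(credentials["expiration"])
    return now >= expiration


# Only the expiration field is needed for this check.
example_credentials = {"expiration": "2023-08-30T05:47:33+00:00"}

before = datetime(2023, 8, 29, 12, 0, tzinfo=timezone.utc)
after = datetime(2023, 8, 31, 12, 0, tzinfo=timezone.utc)
print(credentials_expired(example_credentials, now=before))
print(credentials_expired(example_credentials, now=after))
```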


However, once a seqspec configuration file has been uploaded, you then need to go back and update the fastqs to point to the newly created seqspec configuration object.

This updates one of the initial fastqs with the seqspec accession I was given after I uploaded the first object.

In [43]:
server.patch_json("/sequence-files/{}/".format(read1), {"seqspec": "TSTFI70244117"})

{'status': 'success',
 '@type': ['result'],
 '@graph': [{'lab': '/labs/ali-mortazavi/',
   'lane': 1,
   'award': '/awards/HG012077/',
   'md5sum': '702ca9574354a99fd009dce261fd9ee7',
   'status': 'in progress',
   'file_set': '/measurement-sets/TSTDS34582101/',
   'accession': 'TSTFI61612395',
   'file_size': 12763646361,
   'read_count': 164354092,
   'file_format': 'fastq',
   'flowcell_id': 'AAC2L2NHV',
   'content_type': 'reads',
   'submitted_by': '/users/ebd1025b-e3ab-4e51-bfcd-1415f278281a/',
   'upload_status': 'validated',
   'content_md5sum': '000dbdcba5227cc2e4e0f0e6dd0aa0fb',
   'sequencing_run': 1,
   'mean_read_length': 140,
   'creation_timestamp': '2023-05-18T23:58:24.746368+00:00',
   'illumina_read_type': 'R1',
   'maximum_read_length': 140,
   'minimum_read_length': 140,
   'sequencing_platform': '/platform-terms/EFO_0010963/',
   'submitted_file_name': 'igvf_b01/next1/B01_13A_R1.fastq.gz',
   'seqspec': '/configuration-files/TSTFI70244117/',
   'schema_version': '3

Now patch the other read of the pair.

In [44]:
server.patch_json("/sequence-files/{}/".format(read2), {"seqspec": "TSTFI70244117"})

{'status': 'success',
 '@type': ['result'],
 '@graph': [{'lab': '/labs/ali-mortazavi/',
   'lane': 1,
   'award': '/awards/HG012077/',
   'md5sum': 'bcb77fea303a7fe2fd80ee18d8dd77fc',
   'status': 'in progress',
   'file_set': '/measurement-sets/TSTDS34582101/',
   'accession': 'TSTFI25832476',
   'file_size': 6888924961,
   'read_count': 164354092,
   'file_format': 'fastq',
   'flowcell_id': 'AAC2L2NHV',
   'content_type': 'reads',
   'submitted_by': '/users/ebd1025b-e3ab-4e51-bfcd-1415f278281a/',
   'upload_status': 'validated',
   'content_md5sum': '57fc5990e668fc21c3e0126df4dddc99',
   'sequencing_run': 1,
   'mean_read_length': 86,
   'creation_timestamp': '2023-05-19T00:05:21.811199+00:00',
   'illumina_read_type': 'R2',
   'maximum_read_length': 86,
   'minimum_read_length': 86,
   'sequencing_platform': '/platform-terms/EFO_0010963/',
   'submitted_file_name': 'igvf_b01/next1/B01_13A_R2.fastq.gz',
   'seqspec': '/configuration-files/TSTFI70244117/',
   'schema_version': '3',
 

# Seqspec Submission functions

Now that we've done it in stages for one pair of fastqs, let's turn it into a program.

In [99]:
def parse_s3_url(url):
    """Split an s3 uri into its bucket name and object key
    """
    url = urlparse(url)
    assert url.scheme == "s3", "Not s3 url {}".format(url)
    
    return url.netloc, url.path[1:]

def post_seqspec(seqspec_metadata, seqspec_contents):
    """Post a seqspec metadata object to the portal and upload the seqspec object as a file to s3
    """
    response = server.post_json("configuration_file", seqspec_metadata)
    print(response)

    # Use the temporary upload credentials issued for this object,
    # not the ones saved earlier for the first example.
    credentials = response["@graph"][0]["upload_credentials"]

    s3_client = boto3.client(
        's3',
        aws_access_key_id=credentials["access_key"],
        aws_secret_access_key=credentials["secret_key"],
        aws_session_token=credentials["session_token"])

    bucket, target = parse_s3_url(credentials["upload_url"])
    s3_client.upload_fileobj(
        BytesIO(seqspec_contents.encode("utf-8")),
        bucket,
        target)
    return response

def register_seqspec(file_set, read1, read2, dry_run=True):
    """Create the seqspec objects and attach them to the fastqs
    
    Parameters:
    - file_set: id the seqspec should be attached to
    - read1, read2: the two fastq ids for the paired end RNA-seq reads.
    
    Other assays might need more options like the I1 read or customizing 
    the read lengths
    """
    
    # Generate the seqspec yaml data and load it into the seqspec objects
    seqspec_contents = generate_parse_wt_yaml(read1, read2)
    example_spec = load_spec_stream(StringIO(seqspec_contents))
    
    # Validate the seqspec against the seqspec schema.
    # this is depending on a global variable for the schema
    seqspec_validate(seqspec_schema, example_spec.to_dict())

    # Generate a MD5 sum for the seqspec yaml file
    md5 = hashlib.md5(seqspec_contents.encode('utf-8'))

    # Construct the configuration_file object for the DACC portal
    seqspec_metadata = {
        "award": award,
        "lab": lab,
        "md5sum": md5.hexdigest(),
        "file_format": "yaml",
        "file_set": file_set,
        "content_type": "seqspec"
    }
    # Make sure the configuration_file passes the DACC's schema
    igvf_validator.validate(seqspec_metadata, "configuration_file")

    # Search the portal for the md5sum of our seqspec file to see if 
    # it has already been submitted
    try:
        response = server.get_json("md5:{}".format(seqspec_metadata["md5sum"]))
        print("found object by {}".format(seqspec_metadata["md5sum"]))
        print(response)
        seqspec_metadata.update({
            "@id": response["@id"],
            "accession": response["accession"],
            "uuid": response["uuid"],
        })
    except encoded.HTTPError as err:
        # If the file has not been submitted, and we're not in dry_run mode 
        # lets submit it
        if err.response.status_code == 404:
            if dry_run:
                print("Would upload {}".format(md5.hexdigest()))
            else:
                response = post_seqspec(seqspec_metadata, seqspec_contents)
                print("creating object")
                print(response)
                if response["status"] == "success":
                    print("Upload succeeded")
                    graph = response["@graph"][0]
                    seqspec_metadata.update({
                        "@id": graph["@id"],
                        "accession": graph["accession"],
                        "uuid": graph["uuid"],
                    }) 
                else:
                    print(response)
                    raise RuntimeError("Unable to create metadata object")

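        else:
            # Any other HTTP error is unexpected; report it and stop rather than
            # carrying on without an "@id" for the seqspec object
            print("Other HTTPError error {}".format(err.response.status_code))
            raise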

    # Once the seqspec configuration file and metadata have been created and uploaded,
    # attach the configuration file to its fastqs.
    for read in [read1, read2]:
        read_id = "/sequence-files/{}/".format(read)
        fileinfo = server.get_json(read_id)
        print(fileinfo)
        if "seqspec" not in fileinfo:
            print("Need to post seqspec")
            if not dry_run:
                print(server.patch_json(read_id, {"seqspec": seqspec_metadata["@id"]}))
        # In a dry run the seqspec object may not exist yet, so "@id" can be missing
        elif fileinfo["seqspec"] != seqspec_metadata.get("@id"):
            print("WARNING: seqspec accessions do not match, {} {}".format(fileinfo["seqspec"], seqspec_metadata.get("@id")))

    return seqspec_metadata
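
Each POST response carries its own `upload_credentials` block, so the S3 destination can be derived per object rather than from a saved dictionary. The sketch below shows that lookup on a trimmed-down response. It assumes `upload_url` sits inside `upload_credentials`, as in the credentials saved earlier, and `upload_target` is an illustrative helper that is not used by the functions in this section.

```python
from urllib.parse import urlparse


def upload_target(response):
    """Return (bucket, key, credentials) for the object created by a POST response."""
    credentials = response["@graph"][0]["upload_credentials"]
    url = urlparse(credentials["upload_url"])
    if url.scheme != "s3":
        raise ValueError("Not an s3 url: {}".format(credentials["upload_url"]))
    return url.netloc, url.path.lstrip("/"), credentials


# Trimmed-down response in the shape printed by the portal; keys are redacted.
example_response = {
    "status": "success",
    "@graph": [{
        "accession": "TSTFI70244117",
        "upload_credentials": {
            "access_key": "redacted",
            "secret_key": "redacted",
            "session_token": "redacted",
            "upload_url": "s3://igvf-files-staging/2023/08/28/23c7e5ce-614c-4b05-91ea-1e92690e27c6/TSTFI70244117.yaml",
        },
    }],
}

bucket, key, _ = upload_target(example_response)
print(bucket)
print(key)
```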

# Testing submission functions

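Test the register_seqspec function with the pair of files submitted by hand above. Since that seqspec's md5sum is already on the portal, the function should find the existing object instead of creating a new one.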

In [100]:
read1 = "TSTFI61612395"
read2 = "TSTFI25832476"
file_set = "ali-mortazavi:B01_13A_illumina"
register_seqspec(file_set, read1, read2, dry_run=True)

found object by 60e97148d27579d6749cfba948b9982e
{'lab': {'@id': '/labs/ali-mortazavi/', 'title': 'Ali Mortazavi, UCI'}, 'award': {'component': 'mapping', '@id': '/awards/HG012077/'}, 'md5sum': '60e97148d27579d6749cfba948b9982e', 'status': 'in progress', 'file_set': '/measurement-sets/TSTDS34582101/', 'accession': 'TSTFI70244117', 'file_format': 'yaml', 'content_type': 'seqspec', 'submitted_by': {'@id': '/users/ebd1025b-e3ab-4e51-bfcd-1415f278281a/', 'title': 'Diane Trout'}, 'upload_status': 'pending', 'schema_version': '1', 'creation_timestamp': '2023-08-28T17:47:33.494967+00:00', '@id': '/configuration-files/TSTFI70244117/', '@type': ['ConfigurationFile', 'File', 'Item'], 'uuid': '23c7e5ce-614c-4b05-91ea-1e92690e27c6', 'summary': 'TSTFI70244117', 'href': '/configuration-files/TSTFI70244117/@@download/TSTFI70244117.yaml', 's3_uri': 's3://igvf-files-staging/2023/08/28/23c7e5ce-614c-4b05-91ea-1e92690e27c6/TSTFI70244117.yaml', '@context': '/terms/', 'actions': [{'name': 'edit', 'title': 

{'award': '/awards/HG012077/',
 'lab': '/labs/ali-mortazavi/',
 'md5sum': '60e97148d27579d6749cfba948b9982e',
 'file_format': 'yaml',
 'file_set': 'ali-mortazavi:B01_13A_illumina',
 'content_type': 'seqspec',
 '@id': '/configuration-files/TSTFI70244117/',
 'accession': 'TSTFI70244117',
 'uuid': '23c7e5ce-614c-4b05-91ea-1e92690e27c6'}
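
The per-fastq decision at the end of register_seqspec can be isolated as a pure function, which makes it easy to test without talking to the portal. This is a sketch of that idea under my own naming: `seqspec_action` and its return values are hypothetical and are not part of the notebook's functions.

```python
def seqspec_action(fileinfo, seqspec_id):
    """Classify what needs to happen to one fastq record.

    Returns "patch" when the fastq has no seqspec yet, "ok" when it already
    points at seqspec_id, and "mismatch" when it points somewhere else.
    """
    current = fileinfo.get("seqspec")
    if current is None:
        return "patch"
    if current == seqspec_id:
        return "ok"
    return "mismatch"


seqspec_id = "/configuration-files/TSTFI70244117/"

# Minimal stand-ins for the records returned by the portal.
unlinked = {"accession": "TSTFI61612395"}
linked = {"accession": "TSTFI25832476", "seqspec": seqspec_id}
elsewhere = {"accession": "TSTFI25832476", "seqspec": "/configuration-files/TSTFI73855650/"}

for fileinfo in (unlinked, linked, elsewhere):
    print(fileinfo["accession"], seqspec_action(fileinfo, seqspec_id))
```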

# Create seqspec objects for remaining fastqs

Now that we have a way to list all of the fastq sets and a function to post everything to the portal, loop through all of our fastqs, posting the seqspec configuration files.

In [107]:
submitted = []

for i, read1 in sequence_file[sequence_file["illumina_read_type"] == "R1"].iterrows():
    file_set_filter = (sequence_file["file_set"] == read1.file_set)
    flowcell_filter = (sequence_file["flowcell_id"] == read1.flowcell_id)
    lane_filter = (sequence_file["lane:integer"] == read1["lane:integer"])
    read2_filter = (sequence_file["illumina_read_type"] == "R2")
    read2s = sequence_file[file_set_filter & flowcell_filter & lane_filter & read2_filter]
    assert read2s.shape[0] == 1
    read2 = read2s.loc[read2s.first_valid_index()]
    #print(i, read1.file_set, read1.accession, read2.accession, read1.md5sum)
    
    submitted.append(register_seqspec(read1.file_set, read1.accession, read2.accession, dry_run=True))


found object by 60e97148d27579d6749cfba948b9982e
{'lab': {'@id': '/labs/ali-mortazavi/', 'title': 'Ali Mortazavi, UCI'}, 'award': {'component': 'mapping', '@id': '/awards/HG012077/'}, 'md5sum': '60e97148d27579d6749cfba948b9982e', 'status': 'in progress', 'file_set': '/measurement-sets/TSTDS34582101/', 'accession': 'TSTFI70244117', 'file_format': 'yaml', 'content_type': 'seqspec', 'submitted_by': {'@id': '/users/ebd1025b-e3ab-4e51-bfcd-1415f278281a/', 'title': 'Diane Trout'}, 'upload_status': 'pending', 'schema_version': '1', 'creation_timestamp': '2023-08-28T17:47:33.494967+00:00', '@id': '/configuration-files/TSTFI70244117/', '@type': ['ConfigurationFile', 'File', 'Item'], 'uuid': '23c7e5ce-614c-4b05-91ea-1e92690e27c6', 'summary': 'TSTFI70244117', 'href': '/configuration-files/TSTFI70244117/@@download/TSTFI70244117.yaml', 's3_uri': 's3://igvf-files-staging/2023/08/28/23c7e5ce-614c-4b05-91ea-1e92690e27c6/TSTFI70244117.yaml', '@context': '/terms/', 'actions': [{'name': 'edit', 'title': 

Error http status: 404 for https://api.sandbox.igvf.org/md5:b7acd22e4ab0c0f3dc4f5c41ca276219


{'status': 'success', '@type': ['result'], '@graph': [{'award': '/awards/HG012077/', 'lab': '/labs/ali-mortazavi/', 'md5sum': 'b7acd22e4ab0c0f3dc4f5c41ca276219', 'file_format': 'yaml', 'file_set': '/measurement-sets/TSTDS07432728/', 'content_type': 'seqspec', 'accession': 'TSTFI73855650', 'status': 'in progress', 'schema_version': '1', 'creation_timestamp': '2023-08-28T23:31:56.437636+00:00', 'submitted_by': '/users/ebd1025b-e3ab-4e51-bfcd-1415f278281a/', 'upload_status': 'pending', '@id': '/configuration-files/TSTFI73855650/', '@type': ['ConfigurationFile', 'File', 'Item'], 'uuid': '4a9a4aa8-b061-4840-87fc-8dd132fabb91', 'summary': 'TSTFI73855650', 'href': '/configuration-files/TSTFI73855650/@@download/TSTFI73855650.yaml', 's3_uri': 's3://igvf-files-staging/2023/08/28/4a9a4aa8-b061-4840-87fc-8dd132fabb91/TSTFI73855650.yaml', 'upload_credentials': {'session_token': 'FwoGZXIvYXdzEOH//////////wEaDAgx+V39kTFgYkK0vCKpAnmulddqPUjF1N112yFitlHrcgkYDBqwDWguMnAeUG6r5sGOlPw//6XU8he/+hv76+fun/hAx

Error http status: 404 for https://api.sandbox.igvf.org/md5:8a676344abc1b8a9b9cda746c3d892e1


{'status': 'success', '@type': ['result'], '@graph': [{'award': '/awards/HG012077/', 'lab': '/labs/ali-mortazavi/', 'md5sum': '8a676344abc1b8a9b9cda746c3d892e1', 'file_format': 'yaml', 'file_set': '/measurement-sets/TSTDS07432728/', 'content_type': 'seqspec', 'accession': 'TSTFI17129253', 'status': 'in progress', 'schema_version': '1', 'creation_timestamp': '2023-08-28T23:31:58.293085+00:00', 'submitted_by': '/users/ebd1025b-e3ab-4e51-bfcd-1415f278281a/', 'upload_status': 'pending', '@id': '/configuration-files/TSTFI17129253/', '@type': ['ConfigurationFile', 'File', 'Item'], 'uuid': '8e731d92-0058-4c5c-b4dc-1430e9164361', 'summary': 'TSTFI17129253', 'href': '/configuration-files/TSTFI17129253/@@download/TSTFI17129253.yaml', 's3_uri': 's3://igvf-files-staging/2023/08/28/8e731d92-0058-4c5c-b4dc-1430e9164361/TSTFI17129253.yaml', 'upload_credentials': {'session_token': 'FwoGZXIvYXdzEOH//////////wEaDK3N04s5zGsYIYy9HyKpAh329YjSUsXPdb058g/hLSOcelUW5ytGExp+rjARse8500vwz2CMXYfImNqC74/Ajp0TGb1V9

Error http status: 404 for https://api.sandbox.igvf.org/md5:bdb44aa1b71c7d218dea5aeaac3bb8b9


{'status': 'success', '@type': ['result'], '@graph': [{'award': '/awards/HG012077/', 'lab': '/labs/ali-mortazavi/', 'md5sum': 'bdb44aa1b71c7d218dea5aeaac3bb8b9', 'file_format': 'yaml', 'file_set': '/measurement-sets/TSTDS72923185/', 'content_type': 'seqspec', 'accession': 'TSTFI90057538', 'status': 'in progress', 'schema_version': '1', 'creation_timestamp': '2023-08-28T23:32:00.340939+00:00', 'submitted_by': '/users/ebd1025b-e3ab-4e51-bfcd-1415f278281a/', 'upload_status': 'pending', '@id': '/configuration-files/TSTFI90057538/', '@type': ['ConfigurationFile', 'File', 'Item'], 'uuid': '86e1e142-431a-44c9-b2f4-112e85d81361', 'summary': 'TSTFI90057538', 'href': '/configuration-files/TSTFI90057538/@@download/TSTFI90057538.yaml', 's3_uri': 's3://igvf-files-staging/2023/08/28/86e1e142-431a-44c9-b2f4-112e85d81361/TSTFI90057538.yaml', 'upload_credentials': {'session_token': 'FwoGZXIvYXdzEOH//////////wEaDCI5pZ0gSHKtraRtTiKpAjEJyjcx4ngHpCAENee2nfjneZeKsp10/15h6cjhebPTP2QwHa4RJKk+tOmy0z/U+BqyceJ80

Error http status: 404 for https://api.sandbox.igvf.org/md5:7fccac1f53b30056ab84a40e28258b5e


{'status': 'success', '@type': ['result'], '@graph': [{'award': '/awards/HG012077/', 'lab': '/labs/ali-mortazavi/', 'md5sum': '7fccac1f53b30056ab84a40e28258b5e', 'file_format': 'yaml', 'file_set': '/measurement-sets/TSTDS72923185/', 'content_type': 'seqspec', 'accession': 'TSTFI80690541', 'status': 'in progress', 'schema_version': '1', 'creation_timestamp': '2023-08-28T23:32:02.038377+00:00', 'submitted_by': '/users/ebd1025b-e3ab-4e51-bfcd-1415f278281a/', 'upload_status': 'pending', '@id': '/configuration-files/TSTFI80690541/', '@type': ['ConfigurationFile', 'File', 'Item'], 'uuid': '892f8b9e-fb02-4e45-b85b-6371abf38eac', 'summary': 'TSTFI80690541', 'href': '/configuration-files/TSTFI80690541/@@download/TSTFI80690541.yaml', 's3_uri': 's3://igvf-files-staging/2023/08/28/892f8b9e-fb02-4e45-b85b-6371abf38eac/TSTFI80690541.yaml', 'upload_credentials': {'session_token': 'FwoGZXIvYXdzEOH//////////wEaDOUbvqUTNrcmFKdRWCKpAvdmpHPCbtwRd3Sgvp9SzY90vYjvITebEmD7IRIohZh5tZf88DQ4p6sw3YmXeyhbJtAjdvCDo

Error http status: 404 for https://api.sandbox.igvf.org/md5:5bc8b6728b0bce7e60368d2ac8b82771


{'status': 'success', '@type': ['result'], '@graph': [{'award': '/awards/HG012077/', 'lab': '/labs/ali-mortazavi/', 'md5sum': '5bc8b6728b0bce7e60368d2ac8b82771', 'file_format': 'yaml', 'file_set': '/measurement-sets/TSTDS95237342/', 'content_type': 'seqspec', 'accession': 'TSTFI33277521', 'status': 'in progress', 'schema_version': '1', 'creation_timestamp': '2023-08-28T23:32:03.634138+00:00', 'submitted_by': '/users/ebd1025b-e3ab-4e51-bfcd-1415f278281a/', 'upload_status': 'pending', '@id': '/configuration-files/TSTFI33277521/', '@type': ['ConfigurationFile', 'File', 'Item'], 'uuid': '7a422dc1-5255-4614-975a-be2d68b544e0', 'summary': 'TSTFI33277521', 'href': '/configuration-files/TSTFI33277521/@@download/TSTFI33277521.yaml', 's3_uri': 's3://igvf-files-staging/2023/08/28/7a422dc1-5255-4614-975a-be2d68b544e0/TSTFI33277521.yaml', 'upload_credentials': {'session_token': 'FwoGZXIvYXdzEOH//////////wEaDKgSELmqIfdx6hQc7iKpAh6QgWA92C5bYKf5anGKKwdExajCGxd58RSh9/DkAMtro2MzEdwshQ8S3fICnALoEij0X96QJ

Error http status: 404 for https://api.sandbox.igvf.org/md5:b5f38e829c8f008cc56546f36b117524


{'status': 'success', '@type': ['result'], '@graph': [{'award': '/awards/HG012077/', 'lab': '/labs/ali-mortazavi/', 'md5sum': 'b5f38e829c8f008cc56546f36b117524', 'file_format': 'yaml', 'file_set': '/measurement-sets/TSTDS95237342/', 'content_type': 'seqspec', 'accession': 'TSTFI72232735', 'status': 'in progress', 'schema_version': '1', 'creation_timestamp': '2023-08-28T23:32:05.311304+00:00', 'submitted_by': '/users/ebd1025b-e3ab-4e51-bfcd-1415f278281a/', 'upload_status': 'pending', '@id': '/configuration-files/TSTFI72232735/', '@type': ['ConfigurationFile', 'File', 'Item'], 'uuid': '26869245-c662-43b1-97b8-fcd9d8717d0a', 'summary': 'TSTFI72232735', 'href': '/configuration-files/TSTFI72232735/@@download/TSTFI72232735.yaml', 's3_uri': 's3://igvf-files-staging/2023/08/28/26869245-c662-43b1-97b8-fcd9d8717d0a/TSTFI72232735.yaml', 'upload_credentials': {'session_token': 'FwoGZXIvYXdzEOH//////////wEaDOYByiI7RUdNG2APeCKpAvUA91LoQv133vaDYbaPjJmR53sYu+BzFYIPbMttoFRh1K591aRXuDku0iFSME7Y6Na8+XlUw

Error http status: 404 for https://api.sandbox.igvf.org/md5:3ddfbc075f1a4e0a1cdb36f57b1d2fcb


{'status': 'success', '@type': ['result'], '@graph': [{'award': '/awards/HG012077/', 'lab': '/labs/ali-mortazavi/', 'md5sum': '3ddfbc075f1a4e0a1cdb36f57b1d2fcb', 'file_format': 'yaml', 'file_set': '/measurement-sets/TSTDS51545328/', 'content_type': 'seqspec', 'accession': 'TSTFI09860062', 'status': 'in progress', 'schema_version': '1', 'creation_timestamp': '2023-08-28T23:32:06.997719+00:00', 'submitted_by': '/users/ebd1025b-e3ab-4e51-bfcd-1415f278281a/', 'upload_status': 'pending', '@id': '/configuration-files/TSTFI09860062/', '@type': ['ConfigurationFile', 'File', 'Item'], 'uuid': 'cd057efe-2196-4ed8-a8af-b3a0b48b6d27', 'summary': 'TSTFI09860062', 'href': '/configuration-files/TSTFI09860062/@@download/TSTFI09860062.yaml', 's3_uri': 's3://igvf-files-staging/2023/08/28/cd057efe-2196-4ed8-a8af-b3a0b48b6d27/TSTFI09860062.yaml', 'upload_credentials': {'session_token': 'FwoGZXIvYXdzEOH//////////wEaDPdAoA6Z5+eN0b38NiKpAr1nvnFw3eVDS+kgw7Ft7fEMMGBPv8CbhtroJzNIGLeSA4UVCGNabOwlexzy93FEQ82ojUo+A

Error http status: 404 for https://api.sandbox.igvf.org/md5:b88388286b0e1b09698d6616968ba5f4


{'status': 'success', '@type': ['result'], '@graph': [{'award': '/awards/HG012077/', 'lab': '/labs/ali-mortazavi/', 'md5sum': 'b88388286b0e1b09698d6616968ba5f4', 'file_format': 'yaml', 'file_set': '/measurement-sets/TSTDS51545328/', 'content_type': 'seqspec', 'accession': 'TSTFI09917232', 'status': 'in progress', 'schema_version': '1', 'creation_timestamp': '2023-08-28T23:32:08.675649+00:00', 'submitted_by': '/users/ebd1025b-e3ab-4e51-bfcd-1415f278281a/', 'upload_status': 'pending', '@id': '/configuration-files/TSTFI09917232/', '@type': ['ConfigurationFile', 'File', 'Item'], 'uuid': 'b6c743d1-c902-478b-85fe-cd0245bfd1e5', 'summary': 'TSTFI09917232', 'href': '/configuration-files/TSTFI09917232/@@download/TSTFI09917232.yaml', 's3_uri': 's3://igvf-files-staging/2023/08/28/b6c743d1-c902-478b-85fe-cd0245bfd1e5/TSTFI09917232.yaml', 'upload_credentials': {'session_token': 'FwoGZXIvYXdzEOH//////////wEaDI9QkCN/h3xpakAb7SKpAsRsq8ngkw+icXhqeMSFXkyQD0k1ej7adP9Ast777aV3gCdPXYvm4l4cLVnUF5uAgely5oUnL

Error http status: 404 for https://api.sandbox.igvf.org/md5:5b50de64f8eb80a504e768199d51cf98


{'status': 'success', '@type': ['result'], '@graph': [{'award': '/awards/HG012077/', 'lab': '/labs/ali-mortazavi/', 'md5sum': '5b50de64f8eb80a504e768199d51cf98', 'file_format': 'yaml', 'file_set': '/measurement-sets/TSTDS76216718/', 'content_type': 'seqspec', 'accession': 'TSTFI14083765', 'status': 'in progress', 'schema_version': '1', 'creation_timestamp': '2023-08-28T23:32:10.413018+00:00', 'submitted_by': '/users/ebd1025b-e3ab-4e51-bfcd-1415f278281a/', 'upload_status': 'pending', '@id': '/configuration-files/TSTFI14083765/', '@type': ['ConfigurationFile', 'File', 'Item'], 'uuid': '6a4dbf8d-67d2-4113-9119-f3c54e224c24', 'summary': 'TSTFI14083765', 'href': '/configuration-files/TSTFI14083765/@@download/TSTFI14083765.yaml', 's3_uri': 's3://igvf-files-staging/2023/08/28/6a4dbf8d-67d2-4113-9119-f3c54e224c24/TSTFI14083765.yaml', 'upload_credentials': {'session_token': 'FwoGZXIvYXdzEOH//////////wEaDM0FsK5aYShZblVRviKpAsgCREyO+jEaREpU7QdqfcHBoEkHJuOnRRvDDRyL2OGvyEsz2SHYsCPsWeOqs235Mm+M8cPMP

Error http status: 404 for https://api.sandbox.igvf.org/md5:2cc3748e61b35a42ffe6fe49baaf041d


{'status': 'success', '@type': ['result'], '@graph': [{'award': '/awards/HG012077/', 'lab': '/labs/ali-mortazavi/', 'md5sum': '2cc3748e61b35a42ffe6fe49baaf041d', 'file_format': 'yaml', 'file_set': '/measurement-sets/TSTDS76216718/', 'content_type': 'seqspec', 'accession': 'TSTFI45451912', 'status': 'in progress', 'schema_version': '1', 'creation_timestamp': '2023-08-28T23:32:12.150784+00:00', 'submitted_by': '/users/ebd1025b-e3ab-4e51-bfcd-1415f278281a/', 'upload_status': 'pending', '@id': '/configuration-files/TSTFI45451912/', '@type': ['ConfigurationFile', 'File', 'Item'], 'uuid': 'e4e156ef-9a87-4834-8b5a-3a25417e06d9', 'summary': 'TSTFI45451912', 'href': '/configuration-files/TSTFI45451912/@@download/TSTFI45451912.yaml', 's3_uri': 's3://igvf-files-staging/2023/08/28/e4e156ef-9a87-4834-8b5a-3a25417e06d9/TSTFI45451912.yaml', 'upload_credentials': {'session_token': 'FwoGZXIvYXdzEOH//////////wEaDHbh7D0l9zM34G3FUSKpAjH1XLxbYoM9MVIbrU1XtZViC6aOSibsDW/eQexRqNeHWe1Giwy0fL+U8kT6CfrHrS8qA0PIi

Error http status: 404 for https://api.sandbox.igvf.org/md5:e2e1969e2b3ef7976715925752b59590


{'status': 'success', '@type': ['result'], '@graph': [{'award': '/awards/HG012077/', 'lab': '/labs/ali-mortazavi/', 'md5sum': 'e2e1969e2b3ef7976715925752b59590', 'file_format': 'yaml', 'file_set': '/measurement-sets/TSTDS02882566/', 'content_type': 'seqspec', 'accession': 'TSTFI37065899', 'status': 'in progress', 'schema_version': '1', 'creation_timestamp': '2023-08-28T23:32:13.768330+00:00', 'submitted_by': '/users/ebd1025b-e3ab-4e51-bfcd-1415f278281a/', 'upload_status': 'pending', '@id': '/configuration-files/TSTFI37065899/', '@type': ['ConfigurationFile', 'File', 'Item'], 'uuid': '99e130da-a29a-4f7a-80bd-a188b8e32f09', 'summary': 'TSTFI37065899', 'href': '/configuration-files/TSTFI37065899/@@download/TSTFI37065899.yaml', 's3_uri': 's3://igvf-files-staging/2023/08/28/99e130da-a29a-4f7a-80bd-a188b8e32f09/TSTFI37065899.yaml', 'upload_credentials': {'session_token': 'FwoGZXIvYXdzEOH//////////wEaDOiUy5buAB/f4LSGNyKpAhV8qHH00csnNdGabjNGtKczG1owKIkbk3LCoBr5TWuZ9HAeSTILdFst8oH8YygheMAmy45Xt

Error http status: 404 for https://api.sandbox.igvf.org/md5:5d53ab3f0244254d8c77a107b4cb93f8


{'status': 'success', '@type': ['result'], '@graph': [{'award': '/awards/HG012077/', 'lab': '/labs/ali-mortazavi/', 'md5sum': '5d53ab3f0244254d8c77a107b4cb93f8', 'file_format': 'yaml', 'file_set': '/measurement-sets/TSTDS02882566/', 'content_type': 'seqspec', 'accession': 'TSTFI86095895', 'status': 'in progress', 'schema_version': '1', 'creation_timestamp': '2023-08-28T23:32:15.480191+00:00', 'submitted_by': '/users/ebd1025b-e3ab-4e51-bfcd-1415f278281a/', 'upload_status': 'pending', '@id': '/configuration-files/TSTFI86095895/', '@type': ['ConfigurationFile', 'File', 'Item'], 'uuid': 'b2c053c0-787f-4169-a131-25e12f09f248', 'summary': 'TSTFI86095895', 'href': '/configuration-files/TSTFI86095895/@@download/TSTFI86095895.yaml', 's3_uri': 's3://igvf-files-staging/2023/08/28/b2c053c0-787f-4169-a131-25e12f09f248/TSTFI86095895.yaml', 'upload_credentials': {'session_token': 'FwoGZXIvYXdzEOH//////////wEaDJiaIIo99xH91FdBOiKpAoJrIiDMgM5PQr9Wy9mnCUXIWUQTWy+Pq5i7G23GokC40bLhkZxnEyst/tVuDILVDTwtDuFWW

Error http status: 404 for https://api.sandbox.igvf.org/md5:34decfb4ef790d7165b9af3119745fae


{'status': 'success', '@type': ['result'], '@graph': [{'award': '/awards/HG012077/', 'lab': '/labs/ali-mortazavi/', 'md5sum': '34decfb4ef790d7165b9af3119745fae', 'file_format': 'yaml', 'file_set': '/measurement-sets/TSTDS70002954/', 'content_type': 'seqspec', 'accession': 'TSTFI99456038', 'status': 'in progress', 'schema_version': '1', 'creation_timestamp': '2023-08-28T23:32:17.178301+00:00', 'submitted_by': '/users/ebd1025b-e3ab-4e51-bfcd-1415f278281a/', 'upload_status': 'pending', '@id': '/configuration-files/TSTFI99456038/', '@type': ['ConfigurationFile', 'File', 'Item'], 'uuid': '2794e1d9-2548-4b80-8d0c-297e7124d991', 'summary': 'TSTFI99456038', 'href': '/configuration-files/TSTFI99456038/@@download/TSTFI99456038.yaml', 's3_uri': 's3://igvf-files-staging/2023/08/28/2794e1d9-2548-4b80-8d0c-297e7124d991/TSTFI99456038.yaml', 'upload_credentials': {'session_token': 'FwoGZXIvYXdzEOH//////////wEaDIH8teKcV06VzdwS0yKpAoQOo+tXGa4XH8ozrfwJ70Hx+62QDTHz5m/zEbeEQ8P29awtdacmeZT6ynQERFekTHCa8A3BF

Error http status: 404 for https://api.sandbox.igvf.org/md5:4667e4b1dabdcb8585d05ddcb0bc6815


{'status': 'success', '@type': ['result'], '@graph': [{'award': '/awards/HG012077/', 'lab': '/labs/ali-mortazavi/', 'md5sum': '4667e4b1dabdcb8585d05ddcb0bc6815', 'file_format': 'yaml', 'file_set': '/measurement-sets/TSTDS70002954/', 'content_type': 'seqspec', 'accession': 'TSTFI89917683', 'status': 'in progress', 'schema_version': '1', 'creation_timestamp': '2023-08-28T23:32:18.844467+00:00', 'submitted_by': '/users/ebd1025b-e3ab-4e51-bfcd-1415f278281a/', 'upload_status': 'pending', '@id': '/configuration-files/TSTFI89917683/', '@type': ['ConfigurationFile', 'File', 'Item'], 'uuid': 'cbc57b39-da2c-44b6-9dc5-90fc240ddd23', 'summary': 'TSTFI89917683', 'href': '/configuration-files/TSTFI89917683/@@download/TSTFI89917683.yaml', 's3_uri': 's3://igvf-files-staging/2023/08/28/cbc57b39-da2c-44b6-9dc5-90fc240ddd23/TSTFI89917683.yaml', 'upload_credentials': {'session_token': 'FwoGZXIvYXdzEOH//////////wEaDJD2YB0PG2V5pez7ZSKpAmkMhL9jFyOLWUPbUY8yXsJBzgnehlaqBCkv3wlk11vOzUSQXLczeHy4rAGSIRHu5NPa/oh9K

Convert the results of creating the objects into a pandas DataFrame so I can store the returned values in my submission spreadsheet.

In [108]:
pandas.DataFrame(submitted)[["accession", "uuid", "file_set", "content_type", "file_format", "md5sum", "award", "lab"]]


,accession,uuid,file_set,content_type,file_format,md5sum,award,lab
0,TSTFI70244117,23c7e5ce-614c-4b05-91ea-1e92690e27c6,ali-mortazavi:B01_13A_illumina,seqspec,yaml,60e97148d27579d6749cfba948b9982e,/awards/HG012077/,/labs/ali-mortazavi/
1,TSTFI11442035,68ccc026-0e6e-4a4c-b91a-dbd1f04524c7,ali-mortazavi:B01_13A_illumina,seqspec,yaml,276e000aa7d8667d19efc0e5f63d0798,/awards/HG012077/,/labs/ali-mortazavi/
2,TSTFI73855650,4a9a4aa8-b061-4840-87fc-8dd132fabb91,ali-mortazavi:B01_13B_illumina,seqspec,yaml,b7acd22e4ab0c0f3dc4f5c41ca276219,/awards/HG012077/,/labs/ali-mortazavi/
3,TSTFI17129253,8e731d92-0058-4c5c-b4dc-1430e9164361,ali-mortazavi:B01_13B_illumina,seqspec,yaml,8a676344abc1b8a9b9cda746c3d892e1,/awards/HG012077/,/labs/ali-mortazavi/
4,TSTFI90057538,86e1e142-431a-44c9-b2f4-112e85d81361,ali-mortazavi:B01_13C_illumina,seqspec,yaml,bdb44aa1b71c7d218dea5aeaac3bb8b9,/awards/HG012077/,/labs/ali-mortazavi/
5,TSTFI80690541,892f8b9e-fb02-4e45-b85b-6371abf38eac,ali-mortazavi:B01_13C_illumina,seqspec,yaml,7fccac1f53b30056ab84a40e28258b5e,/awards/HG012077/,/labs/ali-mortazavi/
6,TSTFI33277521,7a422dc1-5255-4614-975a-be2d68b544e0,ali-mortazavi:B01_13D_illumina,seqspec,yaml,5bc8b6728b0bce7e60368d2ac8b82771,/awards/HG012077/,/labs/ali-mortazavi/
7,TSTFI72232735,26869245-c662-43b1-97b8-fcd9d8717d0a,ali-mortazavi:B01_13D_illumina,seqspec,yaml,b5f38e829c8f008cc56546f36b117524,/awards/HG012077/,/labs/ali-mortazavi/
8,TSTFI09860062,cd057efe-2196-4ed8-a8af-b3a0b48b6d27,ali-mortazavi:B01_13E_illumina,seqspec,yaml,3ddfbc075f1a4e0a1cdb36f57b1d2fcb,/awards/HG012077/,/labs/ali-mortazavi/
9,TSTFI09917232,b6c743d1-c902-478b-85fe-cd0245bfd1e5,ali-mortazavi:B01_13E_illumina,seqspec,yaml,b88388286b0e1b09698d6616968ba5f4,/awards/HG012077/,/labs/ali-mortazavi/
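
To carry these values over to a spreadsheet, one option is to write the selected columns out as tab-separated text. The block below is only a sketch of that idea: `example_submitted` stands in for the notebook's `submitted` list and holds two records copied from the output above, `results_to_sheet` is a hypothetical helper name, and the `StringIO` buffer stands in for whatever file the real submission spreadsheet lives in.

```python
from io import StringIO

import pandas

# Stand-in for the notebook's `submitted` list: two records copied from the
# output above, trimmed to a few fields. The real objects carry more keys.
example_submitted = [
    {
        "accession": "TSTFI09860062",
        "uuid": "cd057efe-2196-4ed8-a8af-b3a0b48b6d27",
        "md5sum": "3ddfbc075f1a4e0a1cdb36f57b1d2fcb",
        "upload_status": "pending",
    },
    {
        "accession": "TSTFI09917232",
        "uuid": "b6c743d1-c902-478b-85fe-cd0245bfd1e5",
        "md5sum": "b88388286b0e1b09698d6616968ba5f4",
        "upload_status": "pending",
    },
]


def results_to_sheet(records, columns):
    """Hypothetical helper: build a DataFrame from dict records, keeping only `columns`."""
    return pandas.DataFrame(records)[columns]


columns = ["accession", "uuid", "md5sum"]
sheet = results_to_sheet(example_submitted, columns)

# Write tab-separated text without the index column. The StringIO buffer is an
# assumption for illustration; a file path could be passed to to_csv instead.
buffer = StringIO()
sheet.to_csv(buffer, sep="\t", index=False)
tsv_text = buffer.getvalue()
print(tsv_text)
```

Passing `index=False` keeps the row index out of the file, so the sheet gains no unnamed leading column when it is read back.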
