#  DSA Schema basic example (Masoud)

This notebook tests a basic JSON schema for the OME CSV files used with DSA (Digital Slide Archive) servers.

The CSV has **6** columns, not counting the default index column.
The CSV is converted to JSON, and each record is then validated against the schema.

---

## Contents

1. [Lib Setup](#Lib-Setup)
1. [CSV Download from DSA](#CSV-Download)
    1. [Read CSV File](#CSV-Read)
    1. [Convert to JSON](#CSV-to-JSON)
1. [OME-Schema](#OME-Schema)
    1. [Validate JSON File](#JSON-Validation)



## Setup <a class="anchor" id= "Lib-Setup"></a>

In [1]:
# Install third-party dependencies before importing them
!pip install jsonschema
!pip install emoji --upgrade

import json
import sys

import emoji
import girder_client
import pandas as pd
from jsonschema import validate

# StringIO moved to the io module in Python 3
if sys.version_info[0] < 3:
    from StringIO import StringIO
else:
    from io import StringIO

Requirement already up-to-date: emoji in /home/mohamed/anaconda2/lib/python2.7/site-packages (0.6.0)


## CSV file download <a class="anchor" id= "CSV-Download"></a>

In [3]:
# Item ID of the CSV file on the DSA server
csv_file_id = "5f3d8b53c0ac4ed1ea110f9b"
request_url = "item/" + csv_file_id + "/download?contentDisposition=attachment"
base_url = "https://styx.neurology.emory.edu/girder/api/v1/"
gc = girder_client.GirderClient(apiUrl=base_url)
# Prompt for the username and password interactively
gc.authenticate(interactive=True)
# jsonResp=False returns the raw HTTP response instead of parsed JSON
csvfile = gc.get(request_url, jsonResp=False)

Login or email: masoud
Password for masoud: ········


## CSV Read <a class="anchor" id= "CSV-Read"></a>

In [4]:

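# csvfile is the raw HTTP response. Its .text attribute is the decoded body,
# which StringIO accepts on Python 2 and 3 (.content is bytes on Python 3).
raw_data = StringIO(csvfile.text)
df = pd.read_csv(raw_data)
df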

Unnamed: 0,channel_number,cycle_number,marker_name,label,excitation_wavelength,emission_wavelength
0,0,0,DNA 1,Hoechst 33342,395,431
1,1,0,A488 background,none,485,525
2,2,0,A555 background,none,555,590
3,3,0,A647 background,none,640,690
4,4,1,DNA 2,Hoechst 33342,395,431
5,5,1,A488 background,Alexa 488,485,525
6,6,1,A555 background,Alexa 555,555,590
7,7,1,A647 background,Alexa 647,640,690
8,8,2,DNA 3,Hoechst 33342,395,431
9,9,2,A488 background,Alexa 488,485,525
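
The `Unnamed: 0` column in the table above is the index that was saved with the CSV. As a minimal sketch, assuming the download is available as raw bytes, the file can be read through `io.BytesIO` with `index_col=0` so that this column becomes the row index instead of a data column. The `csv_bytes` value is a hypothetical stand-in for the downloaded content, not the real file.

```python
import io

import pandas as pd

# Hypothetical stand-in for the downloaded bytes (first rows of the table above).
csv_bytes = (
    b",channel_number,cycle_number,marker_name,label,"
    b"excitation_wavelength,emission_wavelength\n"
    b"0,0,0,DNA 1,Hoechst 33342,395,431\n"
    b"1,1,0,A488 background,none,485,525\n"
)

# BytesIO accepts the raw bytes directly; index_col=0 treats the first,
# unnamed column as the row index rather than as data.
sample_df = pd.read_csv(io.BytesIO(csv_bytes), index_col=0)
print(list(sample_df.columns))
```

With this variant only the six data columns remain, which is the shape the schema expects.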


## Convert to JSON <a class="anchor" id= "CSV-to-JSON"></a>

In [57]:
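# Drop the saved index column so each record holds only the six data columns
records_df = df.drop(columns=["Unnamed: 0"], errors="ignore")
# orient="records" yields one JSON object per CSV row
result = records_df.to_json(orient="records")
# Despite its name, jsonFile is a list of dicts, one per row
jsonFile = json.loads(result)
jsonFile[0]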

{u'channel_number': 0,
 u'cycle_number': 0,
 u'emission_wavelength': 431,
 u'excitation_wavelength': 395,
 u'label': u'Hoechst 33342',
 u'marker_name': u'DNA 1'}
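
To make the shape of the converted data concrete, here is a small sketch of what `orient="records"` produces. The two-column `sample_df` is a made-up illustration, not the real CSV.

```python
import json

import pandas as pd

# Made-up two-row frame, for illustration only.
sample_df = pd.DataFrame({
    "channel_number": [0, 1],
    "marker_name": ["DNA 1", "A488 background"],
})

# orient="records" serializes each row as one JSON object keyed by column name.
records = json.loads(sample_df.to_json(orient="records"))
print(records)
```

Each element of the resulting list is a plain dict, so every row can be validated against the schema on its own.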

## OME Schema <a class="anchor" id= "OME-Schema"></a>

In [65]:
ome_schema = {
    "$schema": "http://json-schema.org/draft-07/schema",
    "$id": "http://example.com/example.json",
    "type": "object",
    "title": "The root schema",
    "description": "The root schema comprises the entire JSON document.",
    "default": {},
    "examples": [
        {
            "channel_number": 0,
            "cycle_number": 0,
            "emission_wavelength": 431,
            "excitation_wavelength": 395,
            "label": "Hoechst 33342",
            "marker_name": "DNA 1",
        }
    ],
    "required": [
        "channel_number",
        "cycle_number",
        "emission_wavelength",
        "excitation_wavelength",
        "label",
        "marker_name",

    ],
    "properties": {
        "channel_number": {
            "$id": "#/properties/channel_number",
            "type": "number",
            "title": "The channel_number schema",
            "description": "An explanation about the purpose of this instance.",
            "default": "",
            "examples": [
                0
            ]
        },
        "cycle_number": {
            "$id": "#/properties/cycle_number",
            "type": "number",
            "title": "The cycle_number schema",
            "description": "An explanation about the purpose of this instance.",
            "default": "",
            "examples": [
                0
            ]
        },
        "emission_wavelength": {
            "$id": "#/properties/emission_wavelength",
            "type": "number",
            "title": "The emission_wavelength schema",
            "description": "An explanation about the purpose of this instance.",
            "default": "",
            "examples": [
                431
            ]
        },
        "excitation_wavelength": {
            "$id": "#/properties/excitation_wavelength",
            "type": "number",
            "title": "The excitation_wavelength schema",
            "description": "An explanation about the purpose of this instance.",
            "default": "",
            "examples": [
                395
            ]
        },
        "label": {
            "$id": "#/properties/label",
            "type": "string",
#             "minLength": "2",
#             "pattern": "[^#?)($]{1,}+ [0-9]{1,}$",
            "pattern": "([A-Z]{1,})|([a-z]{1,})",            
            "title": "The label schema",
            "description": "An explanation about the purpose of this instance.",
            "default": "",
            "examples": [
                "Hoechst 33342"
            ]
        },
        "marker_name": {
            "$id": "#/properties/marker_name",
            "type": ["number", "string"] ,
#             "minLength": "1",
            "pattern": "([A-Z]{1,})|([a-z]{1,})",
            "title": "The marker_name schema",
            "description": "An explanation about the purpose of this instance.",
            "default": "",
            "examples": [
                "DNA 1"
            ]
        }
    }
}
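
The `pattern` keyword used for `label` and `marker_name` is not implicitly anchored in JSON Schema, so it behaves like a regex search rather than a full match. The sketch below mimics that check with the standard `re` module to show which strings the pattern accepts. The helper `matches_pattern` is a hypothetical name used only for this illustration.

```python
import re

# Same pattern as the "label" and "marker_name" properties above.
LABEL_PATTERN = r"([A-Z]{1,})|([a-z]{1,})"


def matches_pattern(value):
    """Mimic the JSON Schema `pattern` check: an unanchored regex search."""
    return re.search(LABEL_PATTERN, value) is not None


for candidate in ["Hoechst 33342", "none", "33342", ""]:
    print(repr(candidate), matches_pattern(candidate))
```

In effect the pattern only requires at least one ASCII letter somewhere in the string, so `"none"` passes while a purely numeric label does not.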

In [66]:
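# Validation test: uncomment the next line to make the first record invalid
# (label must be a string) and confirm that the check below reports it.
# jsonFile[0]["label"] = 5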
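
As a rough illustration of what that test triggers, the sketch below validates a string label and a numeric label against a reduced schema. `label_schema` and `is_valid` are hypothetical names for this example and cover only the `label` rule, not the full schema.

```python
from jsonschema import validate
from jsonschema.exceptions import ValidationError

# Reduced, hypothetical version of the schema: only the "label" rule.
label_schema = {
    "type": "object",
    "required": ["label"],
    "properties": {"label": {"type": "string"}},
}


def is_valid(record):
    """Return True when the record satisfies label_schema."""
    try:
        validate(instance=record, schema=label_schema)
        return True
    except ValidationError:
        return False


print(is_valid({"label": "Hoechst 33342"}))
print(is_valid({"label": 5}))
```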

## Validate JSON File <a class="anchor" id= "JSON-Validation"></a>

In [67]:
from jsonschema.exceptions import ValidationError

not_valid_entry = 0
for idx, entry in enumerate(jsonFile):
    try:
        validate(instance=entry, schema=ome_schema)
    except ValidationError as err:
        # Report the failing record and the reason, then keep checking the rest
        print("Not valid record index {}: {}".format(idx, err.message))
        not_valid_entry += 1

if not_valid_entry:
    print(emoji.emojize('Final result file is not valid :thumbs_down:'))
else:
    print(emoji.emojize('Final result file is valid :thumbs_up:'))

Final result file is valid 👍
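
`validate` stops at the first problem it finds in a record. When a full report is more useful, one possible approach is `Draft7Validator.iter_errors`, which yields every violation. The sketch below uses a reduced schema and made-up records; `mini_schema`, `records`, and `collect_errors` are hypothetical names, not part of the notebook above.

```python
from jsonschema import Draft7Validator

# Reduced, hypothetical schema and records, for illustration only.
mini_schema = {
    "type": "object",
    "required": ["channel_number", "label"],
    "properties": {
        "channel_number": {"type": "number"},
        "label": {"type": "string"},
    },
}
records = [
    {"channel_number": 0, "label": "Hoechst 33342"},
    {"channel_number": "1", "label": 5},
    {"label": "Alexa 488"},
]

validator = Draft7Validator(mini_schema)


def collect_errors(rows):
    """Map each failing row index to a list of (field path, message) pairs."""
    report = {}
    for idx, row in enumerate(rows):
        problems = [
            ("/".join(str(p) for p in err.path), err.message)
            for err in validator.iter_errors(row)
        ]
        if problems:
            report[idx] = problems
    return report


for idx, problems in collect_errors(records).items():
    for path, message in problems:
        # An empty path means the error applies to the record as a whole.
        print(idx, path or "<record>", message)
```

Building the validator once and reusing it also avoids re-checking the schema for every record.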


In [None]:
%%javascript
Jupyter.notebook.save_checkpoint();
Jupyter.notebook.session.delete();