<a href="https://colab.research.google.com/github/seek4science/stress-testing/blob/main/sample_type_creation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install num2words
import num2words

Import the libraries used in the notebook.

* **requests** is used to make HTTP calls
* **json** is used to encode Python objects as JSON strings and to decode JSON strings
* **string** is used to perform text manipulation and checking
* **getpass** is used for non-echoing password input
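
As a small illustration of the **json** round trip (the `fragment` string and the variable names are made up for this example, not taken from SEEK): JSON `true` and `null` decode to Python `True` and `None`, and a Python string such as `'false'` encodes as a JSON string rather than as a boolean.

```python
import json

# A made-up JSON fragment, similar in shape to one sample attribute
fragment = '{"title": "attr_1", "required": true, "unit": null}'

# Decoding: JSON true -> Python True, JSON null -> Python None
decoded = json.loads(fragment)
print(decoded["required"], decoded["unit"])

# Encoding: a Python bool becomes a JSON boolean, a Python str stays a JSON string
encoded_bool = json.dumps({"required": False})
encoded_str = json.dumps({"required": "false"})
print(encoded_bool)
print(encoded_str)
```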

In [None]:
import requests
import json
import string
import getpass

The **base_url** holds the URL of the SEEK instance that will be used in the notebook.

**headers** holds the HTTP headers that will be sent with every HTTP call:

* **Content-type: application/vnd.api+json** - indicates that any data sent will be in JSON API format
* **Accept: application/vnd.api+json** - indicates that the notebook expects any data returned to be in JSON API format
* **Accept-Charset: ISO-8859-1** - indicates that the notebook expects any text returned to be in ISO-8859-1 character set
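
To make the character set point concrete, here is a minimal sketch using only Python's built-in codecs (the sample text is an arbitrary choice): the same bytes read differently depending on which character set the reader assumes.

```python
text = "café"

# In ISO-8859-1 the letter é is a single byte (0xE9)
latin1_bytes = text.encode("iso-8859-1")

# In UTF-8 the same letter takes two bytes (0xC3 0xA9)
utf8_bytes = text.encode("utf-8")

# Decoding with the matching character set recovers the text
decoded_ok = latin1_bytes.decode("iso-8859-1")

# Decoding UTF-8 bytes as ISO-8859-1 does not fail, but garbles the text
decoded_wrong = utf8_bytes.decode("iso-8859-1")

print(len(latin1_bytes), len(utf8_bytes))
print(decoded_ok, decoded_wrong)
```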

In [None]:
base_url = 'https://sandbox10.fairdomhub.org/'

headers = {"Content-type": "application/vnd.api+json",
           "Accept": "application/vnd.api+json",
           "Accept-Charset": "ISO-8859-1"}

Create a **requests** HTTP **Session**. A **Session** holds re-usable settings, such as **headers**, that are applied to every request made through it.

Authentication uses a username and password; the user is prompted for both.
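
When `auth` is a (username, password) tuple, **requests** sends it using HTTP Basic authentication. The sketch below shows roughly what the resulting `Authorization` header looks like. The helper `basic_auth_header` and the credentials are hypothetical, and the notebook itself does not need this code.

```python
import base64

def basic_auth_header(username, password):
    """Build an HTTP Basic Authorization header value (hypothetical helper)."""
    # Basic auth is the base64 encoding of "username:password"
    raw = (username + ":" + password).encode("latin1")
    return "Basic " + base64.b64encode(raw).decode("ascii")

# Placeholder credentials, for illustration only
header_value = basic_auth_header("user", "pass")
print(header_value)
```

Base64 is an encoding, not encryption, so the credentials are only protected in transit when the URL uses HTTPS.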

In [None]:
session = requests.Session()
session.headers.update(headers)
session.auth = (input('Username: '), getpass.getpass('Password: '))

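The sample types are intended to be created within **Project** 3. Note that the JSON templates further down hard-code the project id "2" and do not read **containing_project_id**.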
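
A minimal sketch of one way to apply a project id to a payload in a single place. `with_project` is a hypothetical helper, not part of the notebook or of SEEK, and `payload` is a cut-down stand-in for the templates used later.

```python
import copy

containing_project_id = 3

def with_project(payload, project_id):
    """Return a copy of payload whose projects relationship points at project_id."""
    result = copy.deepcopy(payload)  # leave the caller's payload untouched
    result['data']['relationships']['projects'] = {
        'data': [{'id': str(project_id), 'type': 'projects'}]
    }
    return result

# Cut-down stand-in for a sample type payload (assumption for this example)
payload = {
    'data': {
        'type': 'sample_types',
        'relationships': {
            'projects': {'data': [{'id': '2', 'type': 'projects'}]}
        }
    }
}

updated = with_project(payload, containing_project_id)
print(updated['data']['relationships']['projects'])
```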

In [None]:
containing_project_id = 3


In [None]:
x = '''{
  "data": {
    "type": "sample_types",
    "attributes": {
      "title": "Two columns API",
      "description": "",
      "sample_attributes": [
        {
          "title": "title",
          "description": "",
          "pid": "",
          "sample_attribute_type": {
            "id": 8,
            "title": "String",
            "base_type": "String",
            "regexp": ".*"
          },
          "required": true,
          "pos": "1",
          "unit": null,
          "is_title": true,
          "sample_controlled_vocab_id": null,
          "linked_sample_type_id": null
        }
      ],
      "tags": []
    },
    "relationships": {
      "projects": {
        "data": [
          {
            "id": "2",
            "type": "projects"
          }
        ]
      },
      "submitter": {
        "data": [
          {
            "id": "2",
            "type": "people"
          }
        ]
      }
    }
  }
}'''

In [None]:
sample_data = {
    "data": {
        "type": "samples",
        "attributes": {
            "title": "t",
            "attribute_map": {
                "title": "t"
            }
        },
        "relationships": {
            "creators": {
                "data": [
                    {
                        "id": "2",
                        "type": "people"
                    }
                ]
            },
            "submitter": {
                "data": [
                    {
                        "id": "2",
                        "type": "people"
                    }
                ]
            },
            "projects": {
                "data": [
                    {
                        "id": "2",
                        "type": "projects"
                    }
                ]
            },
            "sample_type": {
                "data": {
                    "id": "st",
                    "type": "sample_types"
                }
            },
            "people": {
                "data": [
                    {
                        "id": "2",
                        "type": "people"
                    }
                ]
            }
        }
    }
}

In [None]:
from num2words import num2words


In [None]:
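def create_sampletype(column_count):
  # Start from the one-column template and set a title such as 'Ten columns API'
  x2 = json.loads(x)
  x2['data']['attributes']['title'] = num2words(column_count).capitalize() + ' columns API'

  # The template already holds the 'title' attribute, so add column_count - 1 more
  for i in range(1, column_count):
    to_add = {}
    to_add['title'] = 'attr_' + str(i)
    to_add['sample_attribute_type'] = {}
    to_add['sample_attribute_type']['id'] = '8'  # same String type id as in the template
    to_add['required'] = False  # a JSON boolean, like the template's true/false values

    x2['data']['attributes']['sample_attributes'].append(to_add)
  return x2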

In [None]:
from pprint import pprint

In [None]:
import copy


In [None]:
def handle_sampletype(column_count, sample_count):
  # Create the sample type on the server and remember its id
  r = session.post(base_url + 'sample_types', json=create_sampletype(column_count))
  r.raise_for_status()
  j = r.json()
  sampletype_id = j['data']['id']

  # Build (but do not post) one payload per sample of the new sample type
  singletons = []
  for i in range(1, sample_count + 1):
    # Deep copy, so the nested dicts of the shared sample_data template are never
    # mutated and attributes from an earlier call cannot leak into later ones
    s = copy.deepcopy(sample_data)
    s['data']['attributes']['title'] = 't_' + str(i)
    s['data']['attributes']['attribute_map']['title'] = 't_' + str(i)
    s['data']['relationships']['sample_type']['data']['id'] = sampletype_id

    for c in range(1, column_count):
      s['data']['attributes']['attribute_map']['attr_' + str(c)] = 'v_' + str(c) + '_' + str(i)

    singletons.append(s)

  return singletons

In [None]:
def post_samples(posts):
  for s in posts:
    r = session.post(base_url + 'samples', json=s)
    r.raise_for_status()

In [None]:
def post_batch_samples(post):
  r = session.post(base_url + 'samples/batch_create', json=post)
  r.raise_for_status()

In [None]:
%time singletons = handle_sampletype(10, 2000)

In [None]:
%time post_samples(singletons)

In [None]:
batch_samples = {'data' : handle_sampletype(10,2000)}

The **post_batch_samples** call is expected to fail with a 502 Bad Gateway error. This is due to a timeout in the communication with the server while the batch is being processed.
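
If the timeout is caused by the size of the batch, one thing to try is sending several smaller batches. The sketch below only shows the splitting step. `chunked` and the chunk size of 100 are assumptions for illustration, and whether smaller batches avoid the timeout on a given SEEK instance is untested here.

```python
def chunked(items, size):
    """Yield consecutive slices of items, each with at most size elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Stand-in payloads; in the notebook these would be the entries of batch_samples['data']
payloads = [{'data': {'attributes': {'title': 't_' + str(i)}}} for i in range(1, 251)]

# Wrap each chunk in the same {'data': [...]} envelope used for the full batch
batches = [{'data': chunk} for chunk in chunked(payloads, 100)]
print([len(b['data']) for b in batches])  # prints [100, 100, 50]
```

Each entry of `batches` could then be passed to **post_batch_samples** in turn.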

In [None]:
%time post_batch_samples(batch_samples)

In [None]:
handle_sampletype(100, 1000)

In [None]:
def search_samples(s):
  # Let requests build and URL-encode the query string
  r = session.get(base_url + 'search', params={'search_type': 'samples', 'q': s})
  r.raise_for_status()

In [None]:
%time search_samples('v_6_')