# Collecting responses via Nettskjema API
This notebook details how to use the Nettskjema API to retrieve responses collected via Nettskjema webforms. The example combines information from the existing documentation on the API's structure (https://utv.uio.no/docs/nettskjema/api/), code samples for the Python library requests, documentation of the R library nettskjemar, and some trial and error.

In order to access the Nettskjema API, you need:

    1) a token (string key) with editing rights to the form in question
    2) a connection from a suitable IP address (such as on the UiO network)
    3) suitable commands for retrieving the data

## Getting access to the API
In order to access a form through the API, you need to generate an API account for your UiO account, generate a unique token with suitable roles for that API account, and grant that API account editing privileges for the form(s) you want to access via the API. This is the link to set up and edit your API account within the Nettskjema web interface: https://nettskjema.no/user/api/index.html

First create an API account with a simple name and description, then click on that generated account and generate a token with suitable roles and IP restrictions. Tokens are strings that act as keys so the system knows who is logging in to the API and that they have permission to do specific things. If you are only downloading responses via the API (instead of setting up and editing forms), your token needs only the roles READ_SUBMISSIONS and READ_FORMS. If you leave the default IP address range, you can access the API from computers on the UiO network. This includes machines logged into remotely (like through VMware Horizon).

When you generate/save the token, the next screen shows you the token string. **COPY AND PASTE THIS INTO A FILE RIGHT AWAY as you will never be able to retrieve it again.** (Though it is easy to just generate another token if needed.) This notebook reads a file in the local directory called 'nettskjema_token.txt' to get the token string.

Once you have the token with suitable roles, click over to your form. Under the rights tab (Rettigheter) there should be a list of Nettskjema users who have editing rights on this form. So long as you are logged in to one of those accounts, you can add your API account as: "*yourapiaccountname*@api". Note: Once you save, it will show up as "*yourapiaccountname* @ api" with spaces around the @, but there should not be spaces in the username when you are pasting it in to grant access. You can check whether the rights have been granted properly by going back to your API account details (under https://nettskjema.no/user/api/index.html) and making sure the form is listed in the Skjemaer table.

Once the rights have been granted, we can use the token to access the API programmatically. The API documentation gives examples of curl commands that can be run from the command line or terminal (https://utv.uio.no/docs/nettskjema/api/). Below are examples of performing the same tasks with Python's requests library. Also available is an R library developed by the UiO research group LCBC to pull response data directly into R data formats, *nettskjemar*: https://lcbc-uio.github.io/nettskjemar/index.html

# Accessing the API via Requests

In [18]:
import requests
import json
import zipfile
import os
import base64
import time

Navigate to the directory storing your *nettskjema_token.txt* file and read the API token stored there.

In [2]:
# os.chdir('M:\\finnu\\kant\\div-ritmo-u1')
# os.chdir('M:\\research')
os.listdir()

['PullingMusicLab.ipynb',
 '.DS_Store',
 'PullingNettskjema.ipynb',
 'CompressedData',
 'README.md',
 '.gitignore',
 'copen_netts.tsv',
 'Test_API',
 '.ipynb_checkpoints',
 'Copendata',
 '.git',
 'nettskjema_token.txt']

In [3]:
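# read the token and strip any trailing newline, which would otherwise corrupt the Authorization header
with open('nettskjema_token.txt', 'r') as f:
    TOKEN = f.read().strip()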

Set up a request session with the token information saved into the authentication settings. This allows us to skip spelling out the token string in this notebook with every API request sent.

In [4]:
session = requests.Session()
session.headers.update({'Authorization': 'Bearer ' + TOKEN})

Test the token by sending a request to see its expiration date.

In [5]:
# confirm the token is working on this IP with the check on expiry date
request_url = "https://nettskjema.no/api/v2/users/admin/tokens/expire-date"
response = session.get(request_url)
response.content

b'{"expireDate":"2022-10-18T17:38:29.000+0200"}'

If the token is wrong, broken, or expired, you will get an error message like:

`b'{"statusCode":400,"message":"Not token authenticated","errors":null,"nestedErrors":null}'`

If the token is recognised, the output will be just the expiry date:

`b'{"expireDate":"2022-10-18T17:38:29.000+0200"}'`

Note: the 'b' before the response string indicates that the API response is transmitted as bytes. This format is important to handle when trying to store the collected API response data. The standard method .decode() converts the byte string into a regular Python string.
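As a minimal sketch of that decoding step, the snippet below parses the example response shown above and turns the expiry date into a timezone-aware datetime. The helper name `parse_expiry` is only an illustration (it is not part of the API), and the date format string is an assumption based on the single example response in this notebook.

```python
import json
from datetime import datetime, timezone

def parse_expiry(raw):
    """Decode the byte string returned by the expire-date request."""
    payload = json.loads(raw.decode())  # bytes -> str -> dict
    # Format assumed from the example response: 2022-10-18T17:38:29.000+0200
    return datetime.strptime(payload['expireDate'], '%Y-%m-%dT%H:%M:%S.%f%z')

# the example response copied from the cell above
example = b'{"expireDate":"2022-10-18T17:38:29.000+0200"}'
expiry = parse_expiry(example)
print(expiry.isoformat())
print('expired:', expiry < datetime.now(timezone.utc))
```

In the notebook, `response.content` from the expire-date request could be passed in place of `example`.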

## Calling for Form metadata

If the token is working, next request information about the form you want to get data from. For this you need the formID number, a unique integer assigned by Nettskjema when the form was created. This is at the end of the form URL (ex: 225781 in https://nettskjema.no/a/225781). The page describing which forms your API account has access to also includes these ID numbers (https://nettskjema.no/user/api/index.html#/user)

The basic request to retrieve metadata returns a description of who has access and editing rights, some history of the form, and the content of the form. The information returned by requests about forms is JSON, which can easily be converted into Python dictionaries.

It is possible to delete and edit forms through the API, but this isn't described here. The request URL formats for these functions should be deducible from the curl commands listed in the API instructions at https://utv.uio.no/docs/nettskjema/api/

In [6]:
# example metadata API request with simple form of one question.

# equivalent curl command
#  $ curl 'https://nettskjema.no/api/v2/forms/225781' -i -X GET -H 'Authorization: Bearer TOKEN'

formID = 225781
request_url = 'https://nettskjema.no/api/v2/forms/' + str(formID)
response = session.get(request_url) # using the request session call which includes the saved API token
form_metadata = json.loads(response.content.decode()) # interpret the received string as a native Python datatype
form_metadata # show the information output

{'formId': 225781,
 'languageCode': 'en',
 'title': '0 Physically Present Participant Number',
 'deliveryDestination': 'DATABASE',
 'formType': 'DEFAULT',
 'theme': 'DEFAULT',
 'createdBy': {'personId': 616293,
  'username': 'danasw@uio.no',
  'fullName': 'Dana Swarbrick',
  'name': 'Dana Swarbrick',
  'type': 'LOCAL'},
 'modifiedBy': {'personId': 1676908,
  'username': 'finnu@uio.no',
  'fullName': 'Finn Upham',
  'name': 'Finn Upham',
  'type': 'LOCAL'},
 'createdDate': '2021-10-21T09:39:49.000+0200',
 'modifiedDate': '2021-10-28T16:29:27.000+0200',
 'openFrom': '2021-10-27T09:44:45.000+0200',
 'respondentGroup': 'ALL',
 'editorsContactEmail': 'danasw@uio.no',
 'editorsSubmissionEmailType': 'NONE',
 'editors': [{'personId': 1914058,
   'username': 'finnu1@api',
   'fullName': 'finn ritmo access',
   'name': 'finn ritmo access',
   'type': 'API'},
  {'personId': 1676908,
   'username': 'finnu@uio.no',
   'fullName': 'Finn Upham',
   'name': 'Finn Upham',
   'type': 'LOCAL'},
  {'perso

If you do not have access rights to a form, or if you are trying to access the API from an IP address that isn't in the range specified by your token, you get API errors instead, like:

 `{'statusCode': 403,
 'message': 'No access to form with id 225782.',
 'errors': None,
 'nestedErrors': None}`
 
 `{'statusCode': 404,
 'message': 'Could not find form with id 22578.',
 'errors': None,
 'nestedErrors': None}`
 
 

In [7]:
formID = 225782
request_url = 'https://nettskjema.no/api/v2/forms/' + str(formID)
response = session.get(request_url) # using the request session call which includes the saved API token
form_metadata = json.loads(response.content.decode()) # interpret the received string as a native Python datatype
form_metadata # show the information output

{'statusCode': 403,
 'message': 'No access to form with id 225782.',
 'errors': None,
 'nestedErrors': None}
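Because these errors arrive as ordinary JSON rather than as Python exceptions, it can help to check each decoded reply before using it. The sketch below assumes, based only on the examples in this notebook, that error replies are dictionaries containing a `statusCode` key; `ensure_ok` is a hypothetical helper name, not something provided by the API or by requests.

```python
def ensure_ok(payload):
    """Return payload unchanged, or raise if it looks like an API error."""
    # Assumption: error replies are dicts with 'statusCode' and 'message',
    # as in the 400, 403 and 404 examples shown in this notebook.
    if isinstance(payload, dict) and 'statusCode' in payload:
        raise RuntimeError(str(payload['statusCode']) + ': ' + str(payload.get('message')))
    return payload

# the 403 reply copied from the output above
error_reply = {'statusCode': 403,
               'message': 'No access to form with id 225782.',
               'errors': None,
               'nestedErrors': None}
try:
    ensure_ok(error_reply)
except RuntimeError as err:
    print('API error ->', err)
```

A call such as `form_metadata = ensure_ok(json.loads(response.content.decode()))` would then stop the notebook early instead of passing an error dictionary on to later cells.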

The addition of '/submissions' to the request URL calls instead for metadata on the responses, called "submissions" by the API. The returned json file is here converted into a list of dictionaries with standard information about each submission (submission ID, created and modified dates, etc.) as well as all the form responses. 

Responses are returned in reverse chronological order: the last response is the first submission in the list. 

In [8]:
formID = 225781
request_url = 'https://nettskjema.no/api/v2/forms/' + str(formID) + '/submissions' 
response = session.get(request_url) # using the request session call which includes the saved API token
sub_metadata = json.loads(response.content.decode()) # interpret the received string as a native Python datatype
sub_metadata[:2] # examples of the last two responses received

[{'submissionId': 16674015,
  'createdDate': '2021-10-27T00:00:00.000+0200',
  'modifiedDate': '2021-10-27T00:00:00.000+0200',
  'delivered': True,
  'answerTime': 0,
  'answers': [{'answerId': 102748640,
    'questionId': 3741786,
    'textAnswer': '40bf6287-e5a4-8a30-8863-cc71c9ef7225'},
   {'answerId': 102748641, 'questionId': 3741787, 'textAnswer': 'Finn'}]},
 {'submissionId': 16653694,
  'createdDate': '2021-10-26T00:00:00.000+0200',
  'modifiedDate': '2021-10-26T00:00:00.000+0200',
  'delivered': True,
  'answerTime': 0,
  'answers': [{'answerId': 102664391,
    'questionId': 3741786,
    'textAnswer': '08ff9d8a-6bc2-897f-24d3-91eb10717741'}]}]

It is possible to request subsets of responses, specifically all responses after either a specific submission date or submission ID. Submission IDs increase monotonically and are assigned uniquely across all of Nettskjema.no. A convenient trick when downloading responses from an active survey is to call for only those submissions received since the last time the data was downloaded. To call only subsets, the request URL is extended with the query parameter "fromDate=" or "fromSubmissionId=" followed by the appropriately formatted threshold. The first parameter after the path is introduced with "?" and any further parameters are joined with "&".

It is also possible to download only the submission ID field, instead of the full submission details, with the addition of "?fields=submissionId". At this time, no other fields can be isolated this way.
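One way to avoid hand-encoding the date is to let the standard library assemble the query string. The following is a sketch rather than the documented way to call the API: `submissions_url` and its keyword arguments are hypothetical names, and only the three query parameters mentioned in this notebook (`fields`, `fromDate`, `fromSubmissionId`) are covered.

```python
from urllib.parse import urlencode

API_ROOT = 'https://nettskjema.no/api/v2/forms/'

def submissions_url(form_id, from_date=None, from_submission_id=None, ids_only=False):
    """Build a submissions request URL with optional filters."""
    params = {}
    if ids_only:
        params['fields'] = 'submissionId'
    if from_date is not None:
        params['fromDate'] = from_date
    if from_submission_id is not None:
        params['fromSubmissionId'] = from_submission_id
    url = API_ROOT + str(form_id) + '/submissions'
    # urlencode percent-encodes ':' and '+' in the date for us
    return url + ('?' + urlencode(params) if params else '')

print(submissions_url(141510, from_date='2021-10-25T08:27:22.000+0100', ids_only=True))
print(submissions_url(225781, from_submission_id=16653000))
```

The returned string can be passed straight to `session.get(...)`.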

In [9]:
# how to call submissions from after a certain date

# curl command template
# $ curl 'https://nettskjema.no/api/v2/forms/8432376/submissions?fields=submissionId&fromDate=2021-01-11T13%3A43%3A17.486%2B0100' -i -X GET -H 'Authorization: Bearer TOKEN'

formID = 141510
# the date 2021-10-25T08:27:22.000+0100 must be URL encoded: ':' becomes %3A and '+' becomes %2B (see https://www.w3schools.com/tags/ref_urlencode.ASP)
date = '2021-10-25T08%3A27%3A22.000%2B0100'

request_url = 'https://nettskjema.no/api/v2/forms/' + str(formID) + '/submissions?fields=submissionId&fromDate=' + date

response = session.get(request_url)
subIDs = json.loads(response.content.decode())

print('Submissions since date ' + date + ': ' + str(len(subIDs)))
if len(subIDs)<5:
    print(subIDs)
else:
    print(subIDs[:2])


Submissions since date 2021-10-25T08%3A27%3A22.000%2B0100: 21559
[{'submissionId': 16794081}, {'submissionId': 16789319}]


Note: Forms that do not collect personal information on Nettskjema do not retain dates in a format that can be used for this kind of range restriction. You will get this error when trying to call a subset of responses by date: 

`{'statusCode': 409, 'message': 'Since the form does not collect personal data, the submissions will not have dates to compare with the fromDate parameter', 'errors': None, 'nestedErrors': None}`

In this case, it is necessary to find a suitable submission ID that corresponds to the same temporal threshold. If you are monitoring an active survey, use the ID of the first submission (index `[0]`) returned by your last API call.
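The bookkeeping for this kind of incremental download can be sketched as below. The helper names are hypothetical, and the two short lists are stand-ins built from submission IDs that appear in this notebook. This notebook does not establish whether `fromSubmissionId` includes the threshold submission itself, so the sketch filters out IDs that were already collected instead of relying on either behaviour.

```python
def newest_id(submissions):
    """Largest submission ID in a list of submission dicts (None if empty)."""
    # max() avoids relying on the reverse chronological ordering of the reply
    return max((s['submissionId'] for s in submissions), default=None)

def only_new(submissions, known_ids):
    """Drop submissions whose IDs were already collected."""
    # Guards against re-downloading the threshold submission, in case
    # fromSubmissionId is inclusive (not confirmed in this notebook).
    return [s for s in submissions if s['submissionId'] not in known_ids]

# stand-in data: what was collected last time, and what a new call returned
previous = [{'submissionId': 16653694}, {'submissionId': 16653300}]
latest = [{'submissionId': 16674015}, {'submissionId': 16653694}]

threshold = newest_id(previous)  # value to pass as fromSubmissionId next time
fresh = only_new(latest, {s['submissionId'] for s in previous})
print(threshold, fresh)
```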

In [10]:
# how to call submissions from after a certain ID, restricting metadata to submission ID
# curl command template
# $ curl 'https://nettskjema.no/api/v2/forms/225781/submissions?fields=submissionId&fromSubmissionId=16653694' -i -X GET \
#   -H 'Authorization: Bearer TOKEN'

formID = 225781
submissionID = 16653000
request_url = 'https://nettskjema.no/api/v2/forms/' + str(formID) + '/submissions?fields=submissionId&fromSubmissionId=' + str(submissionID)
response = session.get(request_url)
subIDs = json.loads(response.content.decode())
print('Submissions since subID ' + str(submissionID) + ': ' + str(len(subIDs)))
if len(subIDs)<5: # print error message or top responses 
    print(subIDs)
else:
    print(subIDs[0])


Submissions since subID 16653000: 3
[{'submissionId': 16674015}, {'submissionId': 16653694}, {'submissionId': 16653300}]


# Gathering MusicLab phone sensor data
The above commands cover access to forms that collect information strictly through the webform interface. Apps like MusicLab also gather information in different shapes that are stored by Nettskjema as attachments to submissions (responses). These are a bit trickier to retrieve, but still accessible through the API.

Note: the following cells will not run without permissions for the MusicLab form on Nettskjema, but the shape should be the same for any form that collects attachments with submissions.

In [11]:
# get metadata on submissions for the music lab app after a certain submission ID

formID = 141510
submissionID = 16561653
request_url = 'https://nettskjema.no/api/v2/forms/' + str(formID) + '/submissions?fromSubmissionId=' + str(submissionID)
response = session.get(request_url)
subIDs = json.loads(response.content.decode())
print('Submissions since subID ' + str(submissionID) + ': ' + str(len(subIDs)))
if len(subIDs)<5: # print error message or top responses 
    print(subIDs)
else:
    print(subIDs[0])


Submissions since subID 16561653: 23523
{'submissionId': 16794081, 'createdDate': '2021-10-31T20:11:18.000+0100', 'modifiedDate': '2021-10-31T20:11:18.000+0100', 'delivered': True, 'answerTime': 0, 'answers': [{'answerId': 103297368, 'questionId': 1996787, 'textAnswer': 'fb5a1b4d-e5c4-638c-1a08-932a34fd07b4'}, {'answerId': 103297367, 'questionId': 1996788, 'textAnswer': 'data.zip', 'attachments': [{'answerAttachmentId': 302899, 'fileName': 'data.zip', 'mediaType': 'application/zip'}]}]}


In order to retrieve the sensor data stored in the submission attachment, we have to call for each file individually and save it appropriately. Here are the essential details from one submission out of the metadata called above. 

In [12]:
subn = 0
print('submissionID : ' + str(subIDs[subn]['submissionId']))
for ans in subIDs[subn]['answers']:
    if 'textAnswer' in ans:
        if len(ans['textAnswer'])>12:
            print('Submitting installation: ' + ans['textAnswer'])
    if 'attachments' in ans:
        print(ans['attachments'])

submissionID : 16794081
Submitting installation: fb5a1b4d-e5c4-638c-1a08-932a34fd07b4
[{'answerAttachmentId': 302899, 'fileName': 'data.zip', 'mediaType': 'application/zip'}]


So to call the attachment for that submission, we use the request:

In [14]:
subn = 0 # just calling one as an example

subID = str(subIDs[subn]['submissionId'])
for ans in subIDs[subn]['answers']:
    if 'attachments' in ans:
        attID = str(ans['attachments'][0]['answerAttachmentId'])
request_url = 'https://nettskjema.no/api/v2/submissions/' + subID + '/attachments/' + attID
response = session.get(request_url)

att_dets = json.loads(response.content.decode())
print('fileName: ' + att_dets['fileName'])
print('fileSize: ' + str(att_dets['fileSize']))
print('mediaType: ' + att_dets['mediaType'])
print('content: ' + att_dets['content'][:500] + '...')

fileName: data.zip
fileSize: 51850
mediaType: application/zip
content: UEsDBAoAAAAAAGiZX1Nnsy6jgcgAAIHIAAA1AAAAZmI1YTFiNGQtZTVjNC02MzhjLTFhMDgtOTMyYTM0ZmQwN2I0LmRldmljZU1vdGlvbi5jc3Z0aW1lc3RhbXAsdGltZSx4LHkseixhbHBoYSxiZXRhLGdhbW1hDQoxNjM1NzA3NDcwMTMwLDM4MjQxLjg5OTk5OTk5MzYyLDAuOTI3OTk5OTczMjk3MTE5MSw1LjcwNjk5OTc3ODc0NzU1OSw3Ljg1MzAwMDE2NDAzMTk4Miw4NC4xNjA1MTE3ODMwMDgyNiwtMjIuMDc2NTgwMTIzOTk3NTI3LDEwLjA4OTU1ODAwMzI2Mzc3NQ0KMTYzNTcwNzQ3MDE0NCwzODI1Ni45MDAwMDAwMDc1OSwwLjg5OTk5OTk3NjE1ODE0MjEsNS42NTk5OTk4NDc0MTIxMDksNy42OTk5OTk4MDkyNjUxMzcsODQuMTYwNTExNzgzMDA4MjYsLTIy...


The Nettskjema API returns attachments as base64-encoded zip files inside the JSON response. To make these readable, we need to decode the string, save it as a zip file, and then unzip it. Thankfully there are Python libraries for this.

First be sure you are in a suitable local folder, then unpack the attachment.

In [31]:
#os.mkdir('./Test_API')
os.chdir('Test_API')

In [16]:
subn = 0 # just calling one as an example

subID = str(subIDs[subn]['submissionId'])
for ans in subIDs[subn]['answers']:
    if 'attachments' in ans:
        attID = str(ans['attachments'][0]['answerAttachmentId'])
request_url = 'https://nettskjema.no/api/v2/submissions/' + subID + '/attachments/' + attID
response = session.get(request_url)

att_dets = json.loads(response.content.decode())

# write the decoded attachment into a zip file
with open('data.zip', 'wb') as f:
    f.write(base64.b64decode(att_dets['content']))

# then unzip that file into a folder named for the submission ID,
# since the filenames inside the zip are not unique across submissions
with zipfile.ZipFile('data.zip', 'r') as zip_ref:
    if not os.path.exists(subID): # skip submissions that were already unpacked
        os.mkdir(subID)
        zip_ref.extractall('./' + subID)

print(os.listdir())
os.chdir('./'+str(subID))
print(os.listdir())
os.chdir('..')

['.DS_Store', '16759222', 'data.zip', '16767882', '16794081', '16758800', '16767010', '16765118', '16766563', '16767008', '16765068', '16767014', '16767875']
['fb5a1b4d-e5c4-638c-1a08-932a34fd07b4.geoLocation.csv', 'fb5a1b4d-e5c4-638c-1a08-932a34fd07b4.deviceMotion.csv']


Here we have a minute-long recording from a device with the unique installation ID 'fb5a1b4d-e5...' in a format that is easy to read.

The files within the zip are named for the device and information type, but do not include the submission number. This means that if they are unzipped into a folder that already contains a previous recording from the same device, the previously saved recording will be overwritten and lost. To avoid this, the files are unzipped within a folder named for that unique submission.
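The decode-and-unzip step can also be done without writing an intermediate `data.zip`, by handing the decoded bytes to `zipfile` through an in-memory buffer. This is an alternative sketch, not the approach used in the cells of this notebook: `unpack_attachment` is a hypothetical helper, and the tiny archive built in the demo is a stand-in for a real `att_dets['content']` string. Unlike the cell above, it extracts even when the submission folder already exists.

```python
import base64
import io
import os
import tempfile
import zipfile

def unpack_attachment(content_b64, submission_id, root='.'):
    """Decode a base64 zip and extract it into root/<submission_id>/."""
    target = os.path.join(root, str(submission_id))
    os.makedirs(target, exist_ok=True)
    raw = base64.b64decode(content_b64)
    # BytesIO lets zipfile read the archive from memory, so no data.zip is written
    with zipfile.ZipFile(io.BytesIO(raw)) as zip_ref:
        zip_ref.extractall(target)
        return zip_ref.namelist()

# Demo with a tiny stand-in archive; in the notebook the first argument
# would be att_dets['content'] and the second subID.
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as z:
    z.writestr('example.deviceMotion.csv', 'timestamp,x\n1,2\n')
fake_content = base64.b64encode(buf.getvalue()).decode()

with tempfile.TemporaryDirectory() as tmp:
    names = unpack_attachment(fake_content, 16794081, root=tmp)
    extracted = os.listdir(os.path.join(tmp, '16794081'))
print(names, extracted)
```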

Now to collect many at once: 

In [32]:
# pull in attachment files and unpack them

checkedSubs = 25

newSubs = 0 # count the submission sampled
tic = time.time()
for submis in subIDs[:checkedSubs]: # just sampling the first checkedSubs submissions as a test
    subID = str(submis['submissionId'])
    # if an answer in this submission has an attachment, get the file
    for ans in submis['answers']:
        if 'attachments' in ans: # only answers with uploaded files carry this key
            attID = str(ans['attachments'][0]['answerAttachmentId'])
            request_url = 'https://nettskjema.no/api/v2/submissions/' + subID + '/attachments/' + attID
            response = session.get(request_url)
            newSubs += 1
            att_dets = json.loads(response.content.decode())
            # write the decoded attachment into a zip file
            with open('data.zip', 'wb') as f:
                f.write(base64.b64decode(att_dets['content']))
            # then unzip that file into a folder named for the unique submission ID
            with zipfile.ZipFile('data.zip', 'r') as zip_ref:
                if not os.path.exists(subID): # skip submissions that were already unpacked
                    os.mkdir(subID)
                    zip_ref.extractall('./' + subID)

print('time to collect ' + str(newSubs) + ' attachments: ' + str(time.time() - tic))

time to collect 25 attachments: 2.028269052505493


The files can then be crawled along with information about which submissions relate to which installations, i.e., which recordings can be sewn together in order.
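That submission-to-installation relationship can be read from the submission metadata itself. The sketch below assumes that the installation ID is the only text answer that parses as a UUID, which holds for the examples in this notebook but is not guaranteed by the API. The helper names are hypothetical, and the sample list is reconstructed from IDs printed elsewhere in this notebook, not taken from a real API reply.

```python
import uuid
from collections import defaultdict

def installation_of(submission):
    """Return the first text answer that parses as a UUID, else None."""
    for ans in submission.get('answers', []):
        try:
            return str(uuid.UUID(ans.get('textAnswer', '')))
        except (ValueError, AttributeError, TypeError):
            continue  # not a UUID (e.g. 'data.zip'), try the next answer
    return None

def group_by_installation(submissions):
    """Map installation ID -> sorted list of its submission IDs."""
    groups = defaultdict(list)
    for sub in submissions:
        groups[installation_of(sub)].append(sub['submissionId'])
    # sorted IDs put each device's recordings in submission order
    return {inst: sorted(ids) for inst, ids in groups.items()}

# reconstructed sample in the shape of the submission metadata shown earlier
sample = [
    {'submissionId': 16794081, 'answers': [
        {'textAnswer': 'fb5a1b4d-e5c4-638c-1a08-932a34fd07b4'},
        {'textAnswer': 'data.zip'}]},
    {'submissionId': 16786008, 'answers': [
        {'textAnswer': 'a62945b6-3ac0-e1d4-63f5-cec7c57aa561'}]},
    {'submissionId': 16786004, 'answers': [
        {'textAnswer': 'a62945b6-3ac0-e1d4-63f5-cec7c57aa561'}]},
]
by_installation = group_by_installation(sample)
print(by_installation)
```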

## Compressing submission files
For some forms of storage and retrieval of unzipped data, this folder-per-submission arrangement is really awkward. The following shows two reorganisation schemes. The first moves the files from a long list of folders to a single folder while adding the submission number to filenames to preserve uniqueness. The second organises the files into unique installation folders.

In [36]:
import shutil

In [38]:
os.chdir('..')
os.makedirs('CompressedData', exist_ok=True) # does not fail if the folder already exists

In [35]:
folders = os.listdir('Test_API')
folders 

['16786391',
 '16780206',
 '16786273',
 '.DS_Store',
 '16777969',
 '16786008',
 'data.zip',
 '16767882',
 '16794081',
 '16767010',
 '16786383',
 '16767008',
 '16781376',
 '16789317',
 '16789319',
 '16777633',
 '16783995',
 '16777971',
 '16785268',
 '16785269',
 '16778331',
 '16787114',
 '16786004',
 '16775525',
 '16767014',
 '16767875',
 '16777820']

In [39]:
# to put all the files inside the CompressedData folder for fastest sftp transfer
for subid in folders:
    if subid.startswith('1'):
        filenames = os.listdir('./Test_API/'+str(subid))
        #print(filenames)
        for fn in filenames:
            if fn.endswith('.csv'):
                sourcefile = './Test_API/'+str(subid) + '/' + fn
                targetfile = './CompressedData/'+str(subid) + '.' + fn
                #print(targetfile)
                shutil.copy2(sourcefile,targetfile)
                
os.listdir('./CompressedData/')

['16767010.cc61a83b-5b8a-6939-6294-5466a6a9d573.deviceMotion.csv',
 '16789319.1f3c39de-1f3f-757d-1c8b-b2c89439d0b3.deviceMotion.csv',
 '16777971.782ab0f9-d8db-34c4-8420-3700e1bbf564.deviceMotion.csv',
 '16786383.fa598a6c-ad16-7169-3686-0242cce08021.deviceMotion.csv',
 '16786273.34a001ef-4d44-38c4-f35c-4104070edbf6.geoLocation.csv',
 '16777633.8f9cd059-3495-5fae-96fb-3e9c043bff11.deviceMotion.csv',
 '16786008.a62945b6-3ac0-e1d4-63f5-cec7c57aa561.deviceMotion.csv',
 '16781376.6d324af2-a252-ecab-2268-ca4aa2e26b99.deviceMotion.csv',
 '16786004.a62945b6-3ac0-e1d4-63f5-cec7c57aa561.deviceMotion.csv',
 '16783995.5c30134c-c0b1-f3d8-c9d8-9ac523608db3.deviceMotion.csv',
 '16786391.fa598a6c-ad16-7169-3686-0242cce08021.deviceMotion.csv',
 '16767010.cc61a83b-5b8a-6939-6294-5466a6a9d573.geoLocation.csv',
 '16785269.0a68f772-30e3-2436-cc64-1444a0805a86.deviceMotion.csv',
 '16777969.782ab0f9-d8db-34c4-8420-3700e1bbf564.deviceMotion.csv',
 '16767008.cc61a83b-5b8a-6939-6294-5466a6a9d573.deviceMotion.csv

To organise the files into folders per installation, we monitor and generate new folders as needed.

In [40]:
os.makedirs('InstOrdData', exist_ok=True) # does not fail if the folder already exists


In [44]:
# to copy the files into one folder per installation inside the InstOrdData folder
foldlist = os.listdir('./InstOrdData/')

for subid in folders:
    if subid.startswith('1'):
        filenames = os.listdir('./Test_API/'+str(subid))
        #print(filenames)
        for fn in filenames:
            if fn.endswith('.csv'):
                # extract the installation ID 
                subdets = fn.split('.')
                instid = subdets[0]
                # if the device doesn't have a folder, generate one
                if instid not in foldlist:
                    os.mkdir('./InstOrdData/' + instid)
                    foldlist = os.listdir('./InstOrdData/')
                    
                sourcefile = './Test_API/' + str(subid) + '/' + fn
                targetfile = './InstOrdData/' + instid + '/'+str(subid) + '.' + fn
                #print(targetfile)
                shutil.copy2(sourcefile,targetfile)
                
os.listdir('./InstOrdData/')

['a62945b6-3ac0-e1d4-63f5-cec7c57aa561',
 '2c213490-5bad-6fe9-3c7e-db7fbcf99877',
 '0a68f772-30e3-2436-cc64-1444a0805a86',
 'fa598a6c-ad16-7169-3686-0242cce08021',
 'a6e6bc47-8c07-4c7b-8e6e-5076393a0444',
 '782ab0f9-d8db-34c4-8420-3700e1bbf564',
 '8f9cd059-3495-5fae-96fb-3e9c043bff11',
 '79c31318-394a-d873-0463-7db0dc4d7d2b',
 '5c30134c-c0b1-f3d8-c9d8-9ac523608db3',
 '6d324af2-a252-ecab-2268-ca4aa2e26b99',
 '1f3c39de-1f3f-757d-1c8b-b2c89439d0b3',
 '34a001ef-4d44-38c4-f35c-4104070edbf6',
 'fb5a1b4d-e5c4-638c-1a08-932a34fd07b4',
 '97d16b6e-81fa-3a8b-e467-45b7d431861a',
 'cfcd73d7-4af9-08e2-aa6a-e73e87bbfbf3',
 'b01f3d3e-d53f-d92e-b60a-ed81be656974',
 'cc61a83b-5b8a-6939-6294-5466a6a9d573']

Now, if we need to use sftp to move the data, we don't need to crawl through thousands of folders to find it. 
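Both reorganisation schemes can be folded into one function. The version below is a sketch using `pathlib`, not a drop-in replacement for the cells above: `regroup` and its arguments are hypothetical names, and it selects submission folders by checking that they are directories rather than by testing whether the name starts with '1'.

```python
import shutil
from pathlib import Path

def regroup(src_root, dst_root, by_installation=False):
    """Copy <src_root>/<subid>/<inst>.<kind>.csv files into dst_root."""
    copied = []
    for sub_dir in sorted(Path(src_root).iterdir()):
        if not sub_dir.is_dir():  # skips data.zip, .DS_Store and other stray files
            continue
        for csv_file in sorted(sub_dir.glob('*.csv')):
            inst_id = csv_file.name.split('.')[0]  # installation ID prefix
            out_dir = Path(dst_root) / inst_id if by_installation else Path(dst_root)
            out_dir.mkdir(parents=True, exist_ok=True)
            # prefix the submission ID so filenames stay unique
            target = out_dir / (sub_dir.name + '.' + csv_file.name)
            shutil.copy2(csv_file, target)
            copied.append(target)
    return copied
```

With the folder names used in this notebook, `regroup('Test_API', 'CompressedData')` would produce the flat layout and `regroup('Test_API', 'InstOrdData', by_installation=True)` the per-installation layout.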

# Check out other forms
Try to convert other forms' response data into convenient formats for analysis. The JSON files are not very convenient for Python analysis strategies. This section is a work in progress.

In [47]:
import pandas as pd

In [48]:
# to begin, the list of forms associated with the MusicLab app for the Copenhagen experiment.
formes = [141510,225781,225357,225336,225387,225707,225692,225713,225713,225702,225709,225693,225714,225695,225711]

In [58]:
# examples of the data shape as received from the API

formID = formes[9]
request_url = 'https://nettskjema.no/api/v2/forms/' + str(formID)
response = session.get(request_url) # using the request session call which includes the saved API token
form_metadata = json.loads(response.content.decode()) # interpret the received string as a native Python datatype
form_metadata # show the information output

{'formId': 225702,
 'languageCode': 'en',
 'title': '4 After Bach (English) MusicLab Copenhagen',
 'deliveryDestination': 'DATABASE',
 'formType': 'DEFAULT',
 'theme': 'DEFAULT',
 'createdBy': {'personId': 616293,
  'username': 'danasw@uio.no',
  'fullName': 'Dana Swarbrick',
  'name': 'Dana Swarbrick',
  'type': 'LOCAL'},
 'modifiedBy': {'personId': 1676908,
  'username': 'finnu@uio.no',
  'fullName': 'Finn Upham',
  'name': 'Finn Upham',
  'type': 'LOCAL'},
 'createdDate': '2021-10-20T18:31:06.000+0200',
 'modifiedDate': '2021-10-29T09:35:51.000+0200',
 'openFrom': '2021-10-20T18:41:57.000+0200',
 'respondentGroup': 'ALL',
 'editorsContactEmail': 'danasw@uio.no',
 'editorsSubmissionEmailType': 'NONE',
 'editors': [{'personId': 573969,
   'username': 'kaylab@uio.no',
   'fullName': 'Kayla Burnim',
   'name': 'Kayla Burnim',
   'type': 'LOCAL'},
  {'personId': 1914058,
   'username': 'finnu1@api',
   'fullName': 'finn ritmo access',
   'name': 'finn ritmo access',
   'type': 'API'},
  

In [59]:
# checking out the form structure
elem = form_metadata['elements']

for m in elem:
    print(str(m['elementId']) + ' ' + m['elementType'])
    if 'questions' in m: 
        for q in m['questions']:
            print('\t Qid: ' + str(q['questionId']))
            print('\t Q: ' + str(q['text']))
            if 'answerOptions' in q:
                for a in q['answerOptions']:
                    print('\t \t' + a['text'])

3484591 QUESTION
	 Qid: 3740387
	 Q: userID
3484592 RADIO
	 Qid: 3740388
	 Q: Did you perceive how the visualisation related to the music?
	 	1 (Not at all)
	 	2
	 	3
	 	4
	 	5 (Totally)
3484593 RADIO
	 Qid: 3740389
	 Q: Did the coloured lines and discs help you follow the different instruments?
	 	1 (Not at all)
	 	2
	 	3
	 	4
	 	5 (Totally)
3484594 RADIO
	 Qid: 3740390
	 Q: Did the visualization influence your understanding of the structural elements of the music?
	 	1 (Not at all)
	 	2
	 	3
	 	4
	 	5 (Totally)
3484595 RADIO
	 Qid: 3740391
	 Q: Did the visualization augment your appreciation of the repetition of melodic themes?
	 	1 (Not at all)
	 	2
	 	3
	 	4
	 	5 (Totally)
3484596 RADIO
	 Qid: 3740392
	 Q: Compared to seeing a performance without a visualization, did this visualization disturb and degrade your appreciation of the piece?
	 	1 (Not at all)
	 	2
	 	3
	 	4
	 	5 (Totally)
3484597 RADIO
	 Qid: 3740393
	 Q: Did the visualization enhance your experience of the music?
	 	1 
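Since submissions refer to questions only by `questionId`, a lookup from ID to question text is handy when labelling responses later. The sketch below assumes the `elements` / `questions` nesting printed above; `question_lookup` is a hypothetical helper, and the sample dictionary is a trimmed reconstruction of the first two elements of that output, not a full API reply.

```python
def question_lookup(form_metadata):
    """Map questionId -> question text from a form metadata dictionary."""
    lookup = {}
    for element in form_metadata.get('elements', []):
        for question in element.get('questions', []):
            lookup[question['questionId']] = question.get('text')
    return lookup

# trimmed reconstruction of the form structure printed above
sample_form = {'elements': [
    {'elementId': 3484591, 'elementType': 'QUESTION',
     'questions': [{'questionId': 3740387, 'text': 'userID'}]},
    {'elementId': 3484592, 'elementType': 'RADIO',
     'questions': [{'questionId': 3740388,
                    'text': 'Did you perceive how the visualisation related to the music?',
                    'answerOptions': [{'text': '1 (Not at all)'}, {'text': '5 (Totally)'}]}]},
]}
questions = question_lookup(sample_form)
print(questions)
```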

In [60]:
# checking out the form of responses

formID = formes[9]
request_url = 'https://nettskjema.no/api/v2/forms/' + str(formID) + '/submissions?fromSubmissionId=1'
response = session.get(request_url)
subIDs = json.loads(response.content.decode())
print(len(subIDs))
if len(subIDs)<5:
    print(subIDs)
else:
    print(subIDs[0])
    

29
{'submissionId': 16670536, 'createdDate': '2021-10-26T00:00:00.000+0200', 'modifiedDate': '2021-10-26T00:00:00.000+0200', 'delivered': True, 'answerTime': 0, 'answers': [{'answerId': 102726077, 'questionId': 3740392, 'answerOptions': [{'answerOptionId': 8636659, 'sequence': 3, 'text': '3', 'preselected': False, 'correct': False}]}, {'answerId': 102726072, 'questionId': 3740394, 'answerOptions': [{'answerOptionId': 8636667, 'sequence': 1, 'text': '1 (Not at all)', 'preselected': False, 'correct': False}]}, {'answerId': 102726069, 'questionId': 3740398, 'answerOptions': [{'answerOptionId': 8636691, 'sequence': 2, 'text': 'Yes and I moved less than usual', 'preselected': False, 'correct': False}]}, {'answerId': 102726071, 'questionId': 3740395, 'answerOptions': [{'answerOptionId': 8636678, 'sequence': 7, 'text': '7 (I am very familiar with the music)', 'preselected': False, 'correct': False}]}, {'answerId': 102726074, 'questionId': 3740397, 'answerOptions': [{'answerOptionId': 8636688,

In [61]:
# first layer of organisation into pandas
df = pd.read_json(response.content.decode())
# at first pass, the answers column hides a wealth of information that would be good to break out
df.loc[:3,:]

Unnamed: 0,submissionId,createdDate,modifiedDate,delivered,answerTime,answers
0,16670536,2021-10-26T00:00:00.000+0200,2021-10-26T00:00:00.000+0200,True,0,"[{'answerId': 102726077, 'questionId': 3740392..."
1,16666325,2021-10-26T00:00:00.000+0200,2021-10-26T00:00:00.000+0200,True,0,"[{'answerId': 102712401, 'questionId': 3740398..."
2,16666281,2021-10-26T00:00:00.000+0200,2021-10-26T00:00:00.000+0200,True,0,"[{'answerId': 102712298, 'questionId': 3740387..."
3,16666279,2021-10-26T00:00:00.000+0200,2021-10-26T00:00:00.000+0200,True,0,"[{'answerId': 102712284, 'questionId': 3740399..."


In [62]:
# single response content
df.loc[0,'answers']

[{'answerId': 102726077,
  'questionId': 3740392,
  'answerOptions': [{'answerOptionId': 8636659,
    'sequence': 3,
    'text': '3',
    'preselected': False,
    'correct': False}]},
 {'answerId': 102726072,
  'questionId': 3740394,
  'answerOptions': [{'answerOptionId': 8636667,
    'sequence': 1,
    'text': '1 (Not at all)',
    'preselected': False,
    'correct': False}]},
 {'answerId': 102726069,
  'questionId': 3740398,
  'answerOptions': [{'answerOptionId': 8636691,
    'sequence': 2,
    'text': 'Yes and I moved less than usual',
    'preselected': False,
    'correct': False}]},
 {'answerId': 102726071,
  'questionId': 3740395,
  'answerOptions': [{'answerOptionId': 8636678,
    'sequence': 7,
    'text': '7 (I am very familiar with the music)',
    'preselected': False,
    'correct': False}]},
 {'answerId': 102726074,
  'questionId': 3740397,
  'answerOptions': [{'answerOptionId': 8636688,
    'sequence': 3,
    'text': 'Intermittently',
    'preselected': False,
    'cor
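One possible way to break the nested `answers` out is to flatten each submission into a single row keyed by `questionId`. This is a sketch under the assumption that every answer carries either a `textAnswer` or a list of `answerOptions`, as in the outputs in this notebook. `flatten_submission` is a hypothetical helper. The sample data is trimmed from those outputs, with the second submission borrowed from the earlier one-question form to show a text answer.

```python
def flatten_submission(submission):
    """Turn one submission dict into a flat {column: value} row."""
    row = {'submissionId': submission['submissionId']}
    for ans in submission.get('answers', []):
        if 'answerOptions' in ans:
            # multiple selected options are joined into one string
            value = '; '.join(opt['text'] for opt in ans['answerOptions'])
        else:
            value = ans.get('textAnswer')
        row[ans['questionId']] = value
    return row

# trimmed sample in the shape of the API replies shown in this notebook
sample = [
    {'submissionId': 16670536, 'answers': [
        {'answerId': 102726077, 'questionId': 3740392,
         'answerOptions': [{'answerOptionId': 8636659, 'sequence': 3, 'text': '3'}]},
        {'answerId': 102726072, 'questionId': 3740394,
         'answerOptions': [{'answerOptionId': 8636667, 'sequence': 1, 'text': '1 (Not at all)'}]}]},
    {'submissionId': 16674015, 'answers': [
        {'answerId': 102748641, 'questionId': 3741787, 'textAnswer': 'Finn'}]},
]
rows = [flatten_submission(s) for s in sample]
print(rows)
```

Passing `rows` to `pd.DataFrame(rows)` would then give one row per submission and one column per question ID, with missing answers left empty.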