### Notebook 1: Basic Queries
This notebook offers an introduction to the main function responsible for querying the API and the kinds of data available for analysis. It then leads a reader towards other notebooks based on particular interests and characterization specialties. Two classes have been created, one for querying at the library level and another for querying at the sample level. They are laid out as follows:

##### Library Class
The library class contains four important functions:

Library.search_by_ids(ids_list): This is a static function within the library class. It takes a list of library numbers and returns a list of objects associated with each of the libraries, which can be queried as its own instance of the library class.

Library.search_by_composition(only=[],not_including=[],any_of=[]): This is also a static function within the library class. Each of its three arguments takes a list of element symbols, and the function returns a list of objects for the libraries whose composition satisfies all three. The "only" list names elements that must be present in a sample, the "not_including" list names elements that must not be present, and the "any_of" list names elements that may be present without all of them being required.

Library.properties(self): This function returns all relevant properties data about a library within a pandas DataFrame.

Library.spectra(self,which): This function returns either the optical spectra or the x-ray diffraction spectra for all samples in a library, depending on the value of 'which'. The variable 'which' can be set to 'xrd' to get x-ray diffraction spectra or 'optical' to get the ultraviolet reflectance, ultraviolet transmittance, near-infrared reflectance, and the near-infrared transmittance. This data is returned in a pandas DataFrame.

##### Sample Class
The sample class contains three important functions:

Sample.search_by_ids(ids_list): This is a static function within the sample class. It takes a list of sample numbers and returns a list of objects associated with each of the samples, each of which can be queried as its own instance of the sample class.

Sample.properties(self): This function returns all relevant properties data about a sample within a pandas DataFrame.

Sample.spectra(self,which): This function returns either the optical spectra or the x-ray diffraction spectra for a particular sample, depending on the value of 'which'. The variable 'which' can be set to 'xrd' to get x-ray diffraction spectra or 'optical' to get the ultraviolet reflectance, ultraviolet transmittance, near-infrared reflectance, and the near-infrared transmittance. This data is returned in a pandas DataFrame.

In [1]:
import sys
import pandas as pd
sys.path.append('../lib')
#Note: When working in Windows environments, use:
#sys.path.append(r'..\lib')
from library import Library
from sample import Sample
import seaborn as sns
color = sns.color_palette()

%matplotlib inline

Above one sees that the proper modules have now been imported, including the Library and Sample classes discussed above. A brief example is now shown for each of these class functions.

Below is an example of Library.search_by_ids(ids_list). Querying with a list of library ids returns a list of objects, which the "for" loop then iterates over. Calling Library.properties() on each object gives back a pandas DataFrame. From this DataFrame, we select four columns for each library: the computer-given id, the PDAC number (that is, the chamber it was made in), the number given to the library by the researcher, and the elements listed as being a part of the library.

In [2]:
for lib in Library.search_by_ids([7387,10295,7494,7269]):
    print(lib.properties()[['id','pdac','num','elements']])

     id pdac  num     elements
0  7387    4  399  [Cu, S, Sn]
      id pdac   num         elements
0  10295    1  1161  [In, O, Sn, Zn]
     id pdac   num      elements
0  7494    1  1394  [Ta, Sn, Co]
     id pdac   num         elements
0  7269    1  1211  [Co, O, Ni, Zn]


Suppose we wish to know some basic information about all the libraries that contain a certain set of elements. In the example below, we use the Library.search_by_composition function to look at information for all libraries that contain titanium, zinc, oxygen, and tin (and, as an example, we want to ensure that there is no hydrogen present in them). We find eight libraries, which we can further explore if we so choose.

In [3]:
for lib in Library.search_by_composition(only = ['Ti','Zn','O','Sn'], not_including = ['H']):
    print(lib.properties()[['id','elements','pdac','num']])

     id         elements pdac   num
0  7909  [Sn, O, Zn, Ti]    1  1896
      id         elements pdac   num
0  10137  [Sn, O, Zn, Ti]    1  2028
     id         elements pdac   num
0  9867  [Sn, O, Zn, Ti]    1  1952
     id         elements pdac   num
0  7707  [Sn, O, Zn, Ti]    1  1894
     id         elements pdac   num
0  7997  [Sn, O, Zn, Ti]    1  1895
     id         elements pdac   num
0  9918  [Sn, O, Zn, Ti]    1  1951
     id         elements pdac   num
0  7988  [Sn, O, Zn, Ti]    1  1897
     id         elements pdac   num
0  7887  [Sn, O, Zn, Ti]    1  1898
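
The filtering rules described for Library.search_by_composition can be sketched in plain Python. This is only an illustration of the semantics as this notebook describes them, not the library's actual implementation: the helper name matches_composition is hypothetical, and treating "any_of" as "at least one of these must be present" is an assumption.

```python
def matches_composition(elements, only=(), not_including=(), any_of=()):
    """Return True if a library's element list passes all three filters."""
    present = set(elements)
    # Every element in `only` must be present (the text calls these required).
    if not set(only) <= present:
        return False
    # No element in `not_including` may be present.
    if present & set(not_including):
        return False
    # Assumption: when `any_of` is given, at least one of its elements must be present.
    if any_of and not (present & set(any_of)):
        return False
    return True


# Element lists copied from cell outputs earlier in this notebook.
libraries = {
    7387: ['Cu', 'S', 'Sn'],
    10295: ['In', 'O', 'Sn', 'Zn'],
    7909: ['Sn', 'O', 'Zn', 'Ti'],
}

hits = [lib_id for lib_id, elements in libraries.items()
        if matches_composition(elements,
                               only=['Ti', 'Zn', 'O', 'Sn'],
                               not_including=['H'])]
print(hits)  # [7909]
```

Using sets keeps each rule to a single comparison and makes the order of the element lists irrelevant, which matters here because the outputs list elements in a different order than the query does.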


Suppose we want to know everything there is to know about a certain library, including information like the deposition time, the deposition power, etc. We can see all of this within a single pandas DataFrame using the Library.properties() function. To narrow this down, one may look at just certain columns of the pandas DataFrame (as shown above).

In [4]:
Library(7387).properties()

Unnamed: 0,sputter_operator,deposition_gas_flow_sccm,deposition_ts_distance,owner_email,deposition_sample_time_min,xrf_compounds,num,deposition_substrate_material,deposition_compounds,deposition_metadata,...,pdac,deposition_power,deposition_base_pressure_mtorr,data_access,has_ele,deposition_gases,deposition_growth_pressure_mtorr,deposition_target_pulses,deposition_rep_rate,has_opt
0,,,,l.l.baranowski@gmail.com,240,,399,,"[Cu2S, SnS2, None]",,...,4,"[50, 35, None]",,public,0,,3,"[None, None, None]",,0


We can also query the spectra for different libraries, although this usually returns quite a bit of data. The function Library.spectra(self,which) will return the full x-ray diffraction spectrum (which = 'xrd') for each sample or the full optical spectrum (which = 'optical') for each sample. Take note, however, that these commands access a substantial amount of data and are therefore prone to running a bit slower.

In [5]:
Library(7387).spectra(which='xrd')

Unnamed: 0,xrd_angle_1,xrd_background_1,xrd_intensity_1,xrd_angle_2,xrd_background_2,xrd_intensity_2,xrd_angle_3,xrd_background_3,xrd_intensity_3,xrd_angle_4,...,xrd_intensity_41,xrd_angle_42,xrd_background_42,xrd_intensity_42,xrd_angle_43,xrd_background_43,xrd_intensity_43,xrd_angle_44,xrd_background_44,xrd_intensity_44
0,19.00,25563.558594,25563.558594,19.00,25057.816406,25057.816406,19.00,25980.867188,25980.867188,19.00,...,18174.898438,19.00,18932.871094,18932.871094,19.00,18931.644531,18931.644531,19.00,21889.720703,21889.720703
1,19.05,26023.094075,26163.195312,19.05,25573.907226,25626.003906,19.05,26063.926269,26297.552734,19.05,...,18383.714844,19.05,19430.222005,19543.777344,19.05,18902.313802,18831.302734,19.05,22136.890625,22062.320312
2,19.10,26243.066298,26070.960938,19.10,25721.741943,25672.974609,19.10,26047.262966,25792.748047,19.10,...,18426.494141,19.10,19600.899577,19528.498047,19.10,19128.110677,18943.994141,19.10,22359.730468,22458.630859
3,19.15,26589.720884,26314.277344,19.15,25904.824137,26118.734375,19.15,26232.246763,25875.626953,19.15,...,18832.458984,19.15,19805.197356,19794.083984,19.15,19415.091218,19479.394531,19.15,22597.772280,22463.779297
4,19.20,26848.283137,27184.642578,19.20,26090.337410,26166.630859,19.20,26485.411856,26599.285156,19.20,...,18977.664062,19.20,20036.221463,20180.960938,19.20,19681.624783,19676.998047,19.20,22856.023663,22825.300781
5,19.25,27074.674978,27631.363281,19.25,26288.102855,26286.468750,19.25,26734.147464,26982.212891,19.25,...,19189.962891,19.25,20259.568304,20006.964844,19.25,19912.276157,19971.642578,19.25,23076.414953,23171.726562
6,19.30,27290.485997,27721.039062,19.30,26502.595716,26219.134766,19.30,26961.453919,27149.376953,19.30,...,19452.931641,19.30,20468.538943,20430.046875,19.30,20126.913062,20512.337891,19.30,23297.352187,23409.023438
7,19.35,27508.240592,27905.333984,19.35,26725.402611,26599.638672,19.35,27185.663449,27102.310547,19.35,...,19806.570312,19.35,20674.789917,21228.537109,19.35,20339.777640,20636.941406,19.35,23527.729109,24167.662109
8,19.40,27729.502283,28026.691406,19.40,26951.854328,27335.708984,19.40,27413.339235,27874.187500,19.40,...,19758.742188,19.40,20887.872504,21154.195312,19.40,20559.803809,20784.466797,19.40,23756.269576,23957.716797
9,19.45,27959.632244,28109.275391,19.45,27180.492786,27323.460938,19.45,27642.071093,27600.697266,19.45,...,19769.851562,19.45,21112.117708,21321.046875,19.45,20782.560226,20714.238281,19.45,23985.673184,24083.224609
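
The XRD output above is wide: each trailing index gets its own xrd_angle, xrd_background, and xrd_intensity column. A minimal sketch of grouping those column names by index is shown here, using only the naming pattern visible in the output. The helper group_xrd_columns is hypothetical, and reading the trailing number as the sample position within the library is an assumption.

```python
import re
from collections import defaultdict

# Column-name pattern seen in the output above, e.g. "xrd_intensity_41".
COLUMN_PATTERN = re.compile(r'^xrd_(angle|background|intensity)_(\d+)$')


def group_xrd_columns(columns):
    """Map each trailing index to a dict of quantity name -> column name."""
    groups = defaultdict(dict)
    for name in columns:
        match = COLUMN_PATTERN.match(name)
        if match:  # skip any column that does not follow the pattern
            quantity, index = match.group(1), int(match.group(2))
            groups[index][quantity] = name
    return dict(groups)


columns = ['xrd_angle_1', 'xrd_background_1', 'xrd_intensity_1',
           'xrd_angle_2', 'xrd_background_2', 'xrd_intensity_2']
groups = group_xrd_columns(columns)
print(groups[2]['intensity'])  # xrd_intensity_2
```

With a real DataFrame, passing its column labels to this helper would give the three column names needed to pull out one index's angle, background, and intensity series.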


In [6]:
Library(8307).spectra(which='optical')

TypeError: 'NoneType' object has no attribute '__getitem__'
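
The call above fails with a TypeError, and the notebook does not state the cause. When looping over many libraries, one option is to wrap the call so that a single failure does not stop the loop. The sketch below is an assumption-laden illustration: safe_spectra and StubLibrary are hypothetical names, and the stub only mimics the failure so that the example runs without the API.

```python
VALID_KINDS = ('xrd', 'optical')


def safe_spectra(obj, which):
    """Call obj.spectra(which=which); return None if it raises TypeError."""
    if which not in VALID_KINDS:
        raise ValueError("which must be 'xrd' or 'optical', got %r" % (which,))
    try:
        return obj.spectra(which=which)
    except TypeError:
        # Same exception type as in the traceback above; the underlying
        # cause is not stated in the notebook, so it is not guessed here.
        return None


class StubLibrary:
    """Hypothetical stand-in for Library, used only to exercise the wrapper."""

    def spectra(self, which):
        if which == 'optical':
            missing = None
            return missing['optical']  # raises TypeError, like the cell above
        return [[19.0, 25563.558594]]


print(safe_spectra(StubLibrary(), 'optical'))  # None
print(safe_spectra(StubLibrary(), 'xrd'))      # [[19.0, 25563.558594]]
```

Catching TypeError this broadly can hide unrelated bugs, so the wrapper is best suited to exploratory loops where a missing result is acceptable.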

Many of the same techniques used on an entire 44-sample library may also be used on a single sample. Data may be queried just as before, but the information will be specific to a sample instead of a library. Below is an example of the Sample.search_by_ids(ids_list) function, which returns a list of objects, one for each sample id.

In [None]:
for sample in Sample.search_by_ids([300999,311733,213789]):
    print(sample.properties()[['sample_id','xrf_compounds','xrf_concentration','thickness']])

The code segment above also makes use of the Sample.properties(self) function. Just as with the Library class, this returns all information relevant to this particular sample, formatted within a pandas DataFrame.

In [None]:
Sample(311733).properties()

In the same way that one queries the spectra for an entire library, one can query a single sample for either x-ray diffraction or optical spectra. Note that the near-infrared spectra within the optical DataFrames are significantly shorter than the ultraviolet spectra, so the near-infrared columns are padded with null values.

In [None]:
Sample(213789).spectra('xrd')

In [None]:
Sample(300999).spectra('optical')
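
The padding noted above for the shorter near-infrared spectra can be illustrated without the API. The sketch below uses made-up numbers and plain Python lists rather than the actual DataFrame, so it shows only the shape of the problem: the shorter series is filled out with nulls, and those nulls should be dropped before the series is analyzed.

```python
from itertools import zip_longest

# Made-up values, only to show the shape of the result.
uv_reflectance = [0.12, 0.15, 0.19, 0.22]
nir_reflectance = [0.31, 0.33]  # shorter series

# Pair the two series row by row; the shorter one is padded with None.
rows = list(zip_longest(uv_reflectance, nir_reflectance, fillvalue=None))
for row in rows:
    print(row)

# Drop the padding before working with the shorter series.
nir_clean = [nir for _, nir in rows if nir is not None]
print(nir_clean)  # [0.31, 0.33]
```

In pandas, calling dropna() on the padded column plays the same role as the list comprehension here.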

This concludes the explanation of the Python classes used to query data from the API.