# Filter By Experimental Methods Demo

This example shows how to filter PDB entries by experimental method.


[To learn more about experimental methods](http://pdb101.rcsb.org/learn/guide-to-understanding-pdb-data/methods-for-determining-structure)


## Imports

In [1]:
from pyspark import SparkConf, SparkContext
from mmtfPyspark.io import MmtfReader
from mmtfPyspark.filters import experimentalMethods

## Configure Spark

In [2]:
conf = SparkConf().setMaster("local[*]") \
                  .setAppName("FilterByExperimentalMethods")
sc = SparkContext(conf=conf)

## Read in MMTF Files

In [3]:
path = "../../resources/mmtf_reduced_sample/"

# Read the MMTF sequence file into an RDD keyed by PDB ID
pdb = MmtfReader.readSequenceFile(path, sc)

## Filter by experimental methods

#### List of supported experimental methods

* experimentalMethods.ELECTRON_CRYSTALLOGRAPHY
* experimentalMethods.ELECTRON_MICROSCOPY
* experimentalMethods.ERP
* experimentalMethods.FIBER_DIFFRACTION
* experimentalMethods.FLUORESCENCE_TRANSFER
* experimentalMethods.INFRARED_SPECTROSCOPY
* experimentalMethods.NEUTRON_DIFFRACTION
* experimentalMethods.POWDER_DIFFRACTION
* experimentalMethods.SOLID_STATE_NMR
* experimentalMethods.SOLUTION_NMR
* experimentalMethods.SOLUTION_SCATTERING
* experimentalMethods.THEORETICAL_MODEL
* experimentalMethods.X_RAY_DIFFRACTION

In [7]:
pdb = pdb.filter(experimentalMethods(experimentalMethods.NEUTRON_DIFFRACTION,
                                     experimentalMethods.X_RAY_DIFFRACTION))
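
To make the matching rule concrete, here is a minimal plain-Python sketch that uses neither Spark nor mmtfPyspark. It rests on an assumption: that the filter keeps an entry only when the entry's experimental methods are exactly the methods passed to the filter. Check the mmtfPyspark source for the authoritative behavior. The `matches_all` helper, the `entries` dictionary, and its IDs are hypothetical and exist only for illustration.

```python
# Hypothetical stand-in for the filter's matching rule
# (an assumption for illustration, not the library code).
def matches_all(entry_methods, requested_methods):
    """Return True when the entry was determined by exactly the requested methods."""
    return sorted(entry_methods) == sorted(requested_methods)


# Made-up entries: ID -> experimental methods recorded for that entry.
entries = {
    "XRAY": ["X-RAY DIFFRACTION"],
    "BOTH": ["X-RAY DIFFRACTION", "NEUTRON DIFFRACTION"],
    "NMR1": ["SOLUTION NMR"],
}

requested = ["NEUTRON DIFFRACTION", "X-RAY DIFFRACTION"]

# Keep only the entries whose methods match the requested set.
kept = [pdb_id for pdb_id, methods in entries.items()
        if matches_all(methods, requested)]
print(kept)
```

Under this assumption, an entry solved by X-ray diffraction alone would be dropped, and only an entry recorded with both methods would remain.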

## Print out entries

In [11]:
# Collect the PDB IDs (the keys of the RDD) on the driver
filtered_proteins = pdb.keys().collect()

print(filtered_proteins)

['5CCE', '5JPC', '5K1Z', '5E5J', '5E5K', '5EBJ', '4PVM', '4PVN', '5C6E', '5C8I', '3VXF', '4JEC', '5A93', '5TKI', '5WEY', '5CCD', '5CE4', '3QZA', '4CVI', '5MON', '5MOO', '5DPN', '5CG5', '5CG6', '3QBA', '5KWF', '4GPG', '3KYX', '3KYY', '3L45', '5XPE', '4CVJ', '4LNC', '3R98', '3R99', '4DVO', '4QXK', '3X2O', '3X2P', '3BYC', '4NY6', '3HGN', '5T8H', '3INS', '4QCD', '4QDP', '4QDW', '3TMJ', '4S2D', '4S2F', '4S2G', '4S2H', '3OTJ', '3KCJ', '3KCL', '3KCO', '2R24', '5JPR', '4PDJ', '4XPV', '5PTI', '5RSA', '4N3M', '4N9M']
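
Because `collect()` returns an ordinary Python list, the result can be inspected without Spark. The sketch below counts and sorts a small subset of the IDs printed above; the `sample_ids` name is introduced here for illustration, and the subset is not the full result.

```python
# A small subset of the IDs printed above, used only for illustration.
sample_ids = ["5CCE", "5JPC", "3INS", "5PTI", "5RSA"]

# Count the entries and list them in alphabetical order.
print(len(sample_ids))
print(sorted(sample_ids))

# Membership tests are case-sensitive; the IDs above are upper case.
print("5PTI" in sample_ids)
```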


## Terminate Spark

In [12]:
sc.stop()