# Filter Exclusively By L Protein Demo

Simple example of reading an MMTF Hadoop Sequence file, keeping only the entries whose polymer chains are all L-protein chains, and counting the remaining entries. The example also shows how methods can be chained for a more concise syntax.

## Imports

In [1]:
from pyspark.sql import SparkSession
from mmtfPyspark.io import mmtfReader
from mmtfPyspark.filters import ContainsLProteinChain
from mmtfPyspark.structureViewer import view_structure

## Configure Spark

In [2]:
spark = SparkSession.builder.appName("FilterExclusivelyByLProteinDemo").getOrCreate()

## Read in MMTF Files, filter by L protein, and count the entries

In [3]:
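path = "../../resources/mmtf_reduced_sample/"

# Read the sequence file, then chain the filter directly onto the result.
# exclusive=True keeps only structures in which every polymer chain is an L-protein.
structures = mmtfReader.read_sequence_file(path) \
                .filter(ContainsLProteinChain(exclusive=True))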

print(f"Number of L-Proteins: {structures.count()}")

Number of L-Proteins: 9316
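
To make the effect of the `exclusive` flag concrete, here is a minimal sketch in plain Python. It does not use mmtfPyspark or Spark. The structure IDs, the chain-type labels, and the helper `contains_l_protein_chain` are hypothetical stand-ins, and the sketch assumes the filter means "at least one L-protein chain" by default and "all polymer chains are L-proteins" when `exclusive=True`.

```python
# Toy stand-in for a set of structures: each hypothetical ID maps to the
# types of its polymer chains. IDs and labels are made up for illustration.
toy_structures = {
    "AAAA": ["L-protein", "L-protein"],
    "BBBB": ["L-protein", "DNA"],
    "CCCC": ["RNA"],
}


def contains_l_protein_chain(chain_types, exclusive=False):
    """Sketch of the assumed filter logic, not the library implementation."""
    is_l_protein = [t == "L-protein" for t in chain_types]
    if exclusive:
        # Every chain must be an L-protein (and there must be at least one chain).
        return bool(is_l_protein) and all(is_l_protein)
    # Default: at least one chain is an L-protein.
    return any(is_l_protein)


inclusive_ids = [k for k, v in toy_structures.items()
                 if contains_l_protein_chain(v)]
exclusive_ids = [k for k, v in toy_structures.items()
                 if contains_l_protein_chain(v, exclusive=True)]

print(inclusive_ids)  # ['AAAA', 'BBBB']
print(exclusive_ids)  # ['AAAA']
```

Under these assumptions, the mixed protein/DNA entry passes the default filter but is dropped when `exclusive=True`, which is why the exclusive count can be smaller than the non-exclusive one.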


## Visualize Structures

In [4]:
structure_names = structures.keys().collect()
view_structure(structure_names, style='sphere');

interactive(children=(IntSlider(value=0, continuous_update=False, description='Structure', max=9315), Output()…

## Terminate Spark 

In [5]:
spark.stop()