# Working with the ProLint `Universe` object

In [11]:
from prolint2 import Universe
from MDAnalysis import Universe as MDUniverse
from prolint2.sampledata import GIRK

ProLint can read data directly from an MDAnalysis `Universe` object. If you already have a pipeline that uses MDAnalysis to read your data, you can switch to ProLint either by replacing the MDAnalysis `Universe` with a ProLint `Universe`, or by passing your existing MDAnalysis `Universe` to ProLint.

Below we create an MDAnalysis `Universe` instance and define a custom `query` and `database`.

In [12]:
# Use MDAnalysis to create a Universe instance
mda_u = MDUniverse(GIRK.coordinates, GIRK.trajectory)
mda_u_query = mda_u.select_atoms('protein and name BB')
mda_u_db = mda_u.select_atoms('resname POPE')

We can now use the MDAnalysis Universe instance to create a ProLint Universe instance:

In [13]:
# Use `mda_u` to create a ProLint Universe instance
u = Universe(universe=mda_u)

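We can also pass the `query` and `database` AtomGroups directly when creating a ProLint Universe instance. The call is left commented out here, so the rest of this notebook keeps using the default selections: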

In [14]:
# u = Universe(universe=mda_u, query=mda_u_query, db=mda_u_db)

### Accessing the query and database AtomGroups

In [15]:
u.query, u.database

(<ProLint Wrapper for <AtomGroup with 2956 atoms>>,
 <ProLint Wrapper for <AtomGroup with 20864 atoms>>)

Notice how both the `query` and `database` AtomGroups are ProLint wrappers around the MDAnalysis AtomGroups. This means that you get to keep all the functionality of MDAnalysis AtomGroups, but you can also use the ProLint-specific functions, such as making changes to the `query` and `database` AtomGroups or accessing ProLint-specific attributes.
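To illustrate the idea of a wrapper that keeps the wrapped object's functionality while adding its own, here is a minimal, self-contained sketch of the delegation pattern. Everything in it is hypothetical: `GroupWrapper` and `FakeAtomGroup` are toy stand-ins, not ProLint or MDAnalysis classes, and ProLint's actual implementation may well differ (for example, by subclassing the MDAnalysis AtomGroup).

```python
class FakeAtomGroup:
    """Toy object playing the role of an MDAnalysis AtomGroup (hypothetical)."""

    def __init__(self, resnames):
        self.resnames = list(resnames)

    @property
    def n_atoms(self):
        return len(self.resnames)


class GroupWrapper:
    """Hypothetical wrapper: adds helpers, forwards everything else."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __getattr__(self, name):
        # Only called when normal lookup fails, so any attribute the
        # wrapper does not define is forwarded to the wrapped object.
        return getattr(self._wrapped, name)

    def __repr__(self):
        return f"<Wrapper for {self._wrapped!r}>"

    @property
    def unique_resnames(self):
        # Extra convenience attribute defined on the wrapper itself.
        return sorted(set(self._wrapped.resnames))


group = GroupWrapper(FakeAtomGroup(["POPE", "CHOL", "POPE"]))
print(group.n_atoms)          # forwarded to the wrapped object
print(group.unique_resnames)  # provided by the wrapper
```

The point of the pattern is that existing code written against the wrapped object keeps working, while the extra attributes are available on the same instance.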

In [16]:
u.database.unique_resnames, u.database.get_resnames([2345, 2346, 3050]), u.database.filter_resids_by_resname([2345, 2346, 3050], 'CHOL')

(array(['CHOL', 'POPE', 'POPS'], dtype=object),
 ['POPE', 'POPE', 'CHOL'],
 array([3050]))

In [17]:
u.query.get_resnames([1, 2, 3, 4, 5], out=dict)

{1: 'ARG', 2: 'GLN', 3: 'ARG', 4: 'TYR', 5: 'MET'}

### Modifying the query and database AtomGroups

#### `remove` from a ProLint AtomGroup

In [18]:
# Remove all residues with resname 'ARG' from the query
s = u.query.remove(resname='ARG')
u.query.n_atoms, s.n_atoms

(2956, 2764)

In [19]:
# Remove all residues with resname 'ARG' as well as all residues numbered below 100
s = u.query.remove(resname='ARG', resnum=[*range(100)])
u.query.n_atoms, s.n_atoms

(2956, 2532)

In [20]:
# A more complicated example:
# remove residues named 'ARG' or numbered 1, plus atoms named 'BB' or with atom ids 1-9
s = u.query.remove(resname='ARG', resnum=[1], atomname=['BB'], atomids=[1, 2, 3, 4, 5, 6, 7, 8, 9])
u.query.n_atoms, s.n_atoms

(2956, 1543)

`Important`: 
1. `remove` combines all input arguments into a single selection string concatenated with `or` statements.
2. The above code returns a new ProLint AtomGroup, but does not modify the original AtomGroup. To modify the original AtomGroup, you need to use assignment. See below.
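The first point can be pictured with a small sketch of how several keyword arguments might be joined into one `or` selection string. The helper `build_selection` is hypothetical and is not ProLint's code; in particular, the mapping from argument names to the selection keywords `resname`, `resnum`, `name` and `id` is an assumption made for illustration.

```python
def build_selection(resname=None, resnum=None, atomname=None, atomids=None):
    """Join every supplied criterion with `or` (hypothetical helper)."""

    def as_list(value):
        # Accept either a single value or an iterable of values.
        return [value] if isinstance(value, (str, int)) else list(value)

    # Assumed mapping from argument to selection keyword.
    criteria = [
        ("resname", resname),
        ("resnum", resnum),
        ("name", atomname),
        ("id", atomids),
    ]
    parts = []
    for keyword, value in criteria:
        if value is not None:
            values = " ".join(str(v) for v in as_list(value))
            parts.append(f"{keyword} {values}")
    return " or ".join(parts)


selection = build_selection(resname="ARG", resnum=[1], atomname=["BB"], atomids=[1, 2, 3])
print(selection)
```

Because the criteria are joined with `or`, an atom is removed as soon as it matches any one of them, which is why adding more arguments to `remove` can only shrink the resulting AtomGroup further.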

In [21]:
u.query = u.query.remove(resname='ARG')
u.query.n_atoms

2764

In [22]:
# Let's compute the contacts between the modified query and the database
c = u.compute_contacts(cutoff=7)

100%|██████████| 13/13 [00:00<00:00, 307.78it/s]

In [23]:
# Number of residues we have computed contacts for
len(c.contact_frames.keys())

419

#### `add` to a ProLint AtomGroup

In [24]:
# Let's add back the residues we removed from the query
u.query = u.query.add(resname='ARG')
u.query.n_atoms

2956

In [25]:
# Let's compute the contacts between the query and the database
c = u.compute_contacts(cutoff=7)

100%|██████████| 13/13 [00:00<00:00, 375.47it/s]


In [26]:
# Number of residues we have computed contacts for
len(c.contact_frames.keys())

442
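One way to see which residues account for the difference between two such runs is to compare the key sets of the two `contact_frames` mappings. The sketch below assumes `contact_frames` behaves like a dictionary keyed by residue id (the notebook calls `.keys()` on it) and uses toy placeholder dictionaries rather than real results; in practice you would keep the two `compute_contacts` results under different names instead of overwriting `c`.

```python
# Toy placeholders keyed by residue id; the values are made up and only
# the keys matter for this comparison.
contacts_without_arg = {2: [0, 1], 4: [3], 5: [0]}
contacts_with_arg = {1: [2], 2: [0, 1], 3: [1], 4: [3], 5: [0]}

# Residue ids present only in the run that included the re-added residues.
added = sorted(set(contacts_with_arg) - set(contacts_without_arg))
print(added)
```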