# The general object structure of MDAnalysis

MDAnalysis has two fundamental classes:
- `Universe`
- `AtomGroup`

A particle is represented as an `Atom` object, and multiple particles are grouped together in an `AtomGroup`.

First we create a `Universe` data structure. It stores all the atoms of our system in an `AtomGroup`.

In [1]:
import MDAnalysis as mda
from MDAnalysis.tests.datafiles import PSF, DCD, GRO, XTC
import matplotlib.pyplot as plt

We create our universe from a topology file. A trajectory file can be passed as the second argument, but it is optional.

In [10]:
psf_uni = mda.Universe(PSF)
print(psf_uni)

<Universe with 3341 atoms>


A **topology file** is always required to load data into a `Universe`. MDAnalysis accepts PSF, PDB, CRD, and GRO files, among other formats.

In [11]:
psf_uni.trajectory 

AttributeError: No trajectory loaded into Universe

Accessing the trajectory raised an error because we did not load a trajectory file, and a PSF file does not contain coordinate information.

If a file does contain coordinate information, as a GRO file does, we get a trajectory with a single frame.

In [12]:
# Let's use a GRO file

gro_uni = mda.Universe(GRO)
print(gro_uni)

<Universe with 47681 atoms>


In [16]:
gro_uni.trajectory

<GROReader /home/sujaly/anaconda3/envs/MDAnalysis/lib/python3.13/site-packages/MDAnalysisTests/data/adk_oplsaa.gro with 1 frames of 47681 atoms>

In [17]:
type(gro_uni.trajectory)

MDAnalysis.coordinates.GRO.GROReader

In [18]:
len(gro_uni.trajectory)

1

In [19]:
# universe with trajectory

uni_traj = mda.Universe(PSF, DCD)
print(uni_traj)
print(len(uni_traj.trajectory))

<Universe with 3341 atoms>
98


# Working with groups of atoms

We have three classes to work with:
- `AtomGroup`: can be accessed through the `.atoms` attribute of `Universe`
- `ResidueGroup`: can be accessed through the `.residues` attribute of `Universe`
- `SegmentGroup`: can be accessed through the `.segments` attribute of `Universe`

In [20]:
uni_traj.atoms

<AtomGroup with 3341 atoms>

In [21]:
uni_traj.residues

<ResidueGroup with 214 residues>

In [22]:
uni_traj.segments

<SegmentGroup with 1 segment>

These groups behave much like lists, so you can select from them using slicing.

In [29]:
last_five_atoms = uni_traj.atoms[-5:]
print(last_five_atoms)
print("atoms:", last_five_atoms.names)
print("residue of the atoms:", last_five_atoms.resnames)
print("segment id of the atoms:", last_five_atoms.segids)

<AtomGroup [<Atom 3337: HA1 of type 6 of resname GLY, resid 214 and segid 4AKE>, <Atom 3338: HA2 of type 6 of resname GLY, resid 214 and segid 4AKE>, <Atom 3339: C of type 32 of resname GLY, resid 214 and segid 4AKE>, <Atom 3340: OT1 of type 72 of resname GLY, resid 214 and segid 4AKE>, <Atom 3341: OT2 of type 72 of resname GLY, resid 214 and segid 4AKE>]>
atoms: ['HA1' 'HA2' 'C' 'OT1' 'OT2']
residue of the atoms: ['GLY' 'GLY' 'GLY' 'GLY' 'GLY']
segment id of the atoms: ['4AKE' '4AKE' '4AKE' '4AKE' '4AKE']


MDAnalysis also has a powerful atom selection language, available through the `.select_atoms()` method of an `AtomGroup` or `Universe` instance.

In [36]:
print(uni_traj.select_atoms("resname ASP or resname GLU"))
print(uni_traj.select_atoms("resname ASP or resname GLU").resnames[:50])  # first 50

<AtomGroup [<Atom 318: N of type 54 of resname GLU, resid 22 and segid 4AKE>, <Atom 319: HN of type 1 of resname GLU, resid 22 and segid 4AKE>, <Atom 320: CA of type 22 of resname GLU, resid 22 and segid 4AKE>, ..., <Atom 3271: OE2 of type 72 of resname GLU, resid 210 and segid 4AKE>, <Atom 3272: C of type 20 of resname GLU, resid 210 and segid 4AKE>, <Atom 3273: O of type 70 of resname GLU, resid 210 and segid 4AKE>]>
['GLU' 'GLU' 'GLU' 'GLU' 'GLU' 'GLU' 'GLU' 'GLU' 'GLU' 'GLU' 'GLU' 'GLU'
 'GLU' 'GLU' 'GLU' 'ASP' 'ASP' 'ASP' 'ASP' 'ASP' 'ASP' 'ASP' 'ASP' 'ASP'
 'ASP' 'ASP' 'ASP' 'GLU' 'GLU' 'GLU' 'GLU' 'GLU' 'GLU' 'GLU' 'GLU' 'GLU'
 'GLU' 'GLU' 'GLU' 'GLU' 'GLU' 'GLU' 'ASP' 'ASP' 'ASP' 'ASP' 'ASP' 'ASP'
 'ASP' 'ASP']


In [38]:
print(uni_traj.select_atoms("resid 50-100").n_residues)  # resid ranges are inclusive: resids 50 through 100
print(uni_traj.residues[50:100].n_residues)  # slices are 0-based and end-exclusive: indices 50 through 99

51
50
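
The difference between 51 and 50 comes from two different conventions. The sketch below reproduces it in plain Python, assuming the resids run contiguously from 1 to 214 (consistent with the outputs shown in this notebook, where the first residue is MET 1 and the last is GLY 214); the list `resids` is a hypothetical stand-in for the real residues.

```python
# Stand-in for the residue numbering: resids 1 to 214, one per residue.
resids = list(range(1, 215))

# A selection such as "resid 50-100" keeps both endpoints.
by_resid = [r for r in resids if 50 <= r <= 100]

# A slice is 0-based and excludes its stop index, so [50:100]
# covers positions 50..99, which hold resids 51..100.
by_slice = resids[50:100]

print(len(by_resid), by_resid[0], by_resid[-1])  # 51 50 100
print(len(by_slice), by_slice[0], by_slice[-1])  # 50 51 100
```

So the slice does contain resid 100; the residue it leaves out is resid 50.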


In [40]:
# Glutamic acid is typically named “GLU”, but histidine can be named “HIS”, “HSD”, or “HSE” depending on its protonation state and the force field used.
uni_traj.select_atoms("(resname GLU or resname HS*) and name CA and (resid 1:100)")


<AtomGroup with 6 atoms>

## Note:

An `AtomGroup` created from a selection is sorted and has duplicate atoms removed. This is not true for an `AtomGroup` produced by slicing or indexing, which keeps the atoms in the order requested. Thus, slicing can be used when the order of atoms is crucial.

# Getting atom information from AtomGroups

In [41]:
print(uni_traj.atoms[:20].names)

['N' 'HT1' 'HT2' 'HT3' 'CA' 'HA' 'CB' 'HB1' 'HB2' 'CG' 'HG1' 'HG2' 'SD'
 'CE' 'HE1' 'HE2' 'HE3' 'C' 'O' 'N']


In [42]:
print(uni_traj.atoms[:20].masses)


[14.007  1.008  1.008  1.008 12.011  1.008 12.011  1.008  1.008 12.011
  1.008  1.008 32.06  12.011  1.008  1.008  1.008 12.011 15.999 14.007]


In [43]:
print(uni_traj.atoms[:20].residues)
print(uni_traj.atoms[:20].segments)

<ResidueGroup [<Residue MET, 1>, <Residue ARG, 2>]>
<SegmentGroup [<Segment 4AKE>]>


Notice that the residues contain no duplicates. To get the residue name of each atom, use `.resnames`.

In [44]:
print(uni_traj.atoms[:20].resnames)

['MET' 'MET' 'MET' 'MET' 'MET' 'MET' 'MET' 'MET' 'MET' 'MET' 'MET' 'MET'
 'MET' 'MET' 'MET' 'MET' 'MET' 'MET' 'MET' 'ARG']


In [47]:
near_met = uni_traj.select_atoms("not resname MET and (around 2 resname MET)")
sorted(near_met.groupby(['resnames', 'names']))

[('ALA', 'C'),
 ('ALA', 'HN'),
 ('ARG', 'N'),
 ('ASN', 'O'),
 ('ASP', 'C'),
 ('ASP', 'N'),
 ('GLN', 'C'),
 ('GLU', 'N'),
 ('ILE', 'C'),
 ('LEU', 'N'),
 ('LYS', 'N'),
 ('THR', 'N')]

# AtomGroup positions and methods

In [50]:
ca = uni_traj.select_atoms("resid 1-5 and name CA")
print(ca.positions)
print(type(ca.positions))
print(ca.positions.shape)

[[11.664622    8.393473   -8.983231  ]
 [11.414839    5.4344215  -6.5134845 ]
 [ 8.959755    5.612923   -3.6132305 ]
 [ 8.290068    3.075991   -0.79665166]
 [ 5.011126    3.7638984   1.130355  ]]
<class 'numpy.ndarray'>
(5, 3)


In [52]:
print(ca.center_of_mass())

[ 9.06808195  5.25614133 -3.75524844]
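
For reference, the center of mass is the mass-weighted average of the positions, sum(m_i * r_i) / sum(m_i). The sketch below recomputes it with NumPy from the positions printed above, assuming every CA atom has the carbon mass 12.011 listed earlier in this notebook. It illustrates the formula and is not a description of how MDAnalysis implements `center_of_mass()` internally.

```python
import numpy as np

# CA positions copied from the output above
positions = np.array([
    [11.664622, 8.393473, -8.983231],
    [11.414839, 5.4344215, -6.5134845],
    [8.959755, 5.612923, -3.6132305],
    [8.290068, 3.075991, -0.79665166],
    [5.011126, 3.7638984, 1.130355],
])

# Assumption: all five atoms are carbons with mass 12.011
masses = np.full(len(positions), 12.011)

# Mass-weighted average of the coordinates
com = (masses[:, None] * positions).sum(axis=0) / masses.sum()
print(com)
```

Because all the masses are equal here, the result coincides with the plain mean of the coordinates and agrees with the `center_of_mass()` output up to rounding.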
