# Prepare Electron Microscopy Data

The powerfit program requires an EM density of an unknown structure into which it can fit structures.

Most of the entries on EMDB contain multiple structures, so we want to pretend that we do not know one of the structures.

For this example we will use [EMD-33292](https://www.ebi.ac.uk/emdb/EMD-33292), a sodium channel, with fitted model [7xm9](https://www.ebi.ac.uk/pdbe/entry/pdb/7xm9).

The fitted model consists of the following chains:
- A: Isoform 3 of Sodium channel protein type 9 subunit alpha
- B: Sodium channel subunit beta-1
- C: Sodium channel subunit beta-2

We will use the B chain, the sodium channel subunit beta-1, as the unknown structure.

Let's start by downloading the density map and the fitted model.

In [1]:
!wget -nc https://ftp.ebi.ac.uk/pub/databases/emdb/structures/EMD-33292/map/emd_33292.map.gz
!gunzip -kf emd_33292.map.gz
!wget -nc https://www.ebi.ac.uk/pdbe/entry-files/download/7xm9.cif

--2025-07-10 15:24:27--  https://ftp.ebi.ac.uk/pub/databases/emdb/structures/EMD-33292/map/emd_33292.map.gz
Resolving ftp.ebi.ac.uk (ftp.ebi.ac.uk)... 193.62.193.165
Connecting to ftp.ebi.ac.uk (ftp.ebi.ac.uk)|193.62.193.165|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 62515467 (60M) [application/x-gzip]
Saving to: ‘emd_33292.map.gz’


2025-07-10 15:24:28 (75.4 MB/s) - ‘emd_33292.map.gz’ saved [62515467/62515467]

File ‘7xm9.cif’ already there; not retrieving.
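For readers who prefer to stay in Python instead of shelling out to `wget` and `gunzip`, the download step could look roughly like the sketch below. The helper names `emdb_map_url`, `download` and `gunzip_keep` are made up for illustration, and the URL pattern is simply generalized from the EMD-33292 link used above, so it is an assumption that it holds for other entries.

```python
import gzip
import shutil
import urllib.request
from pathlib import Path


def emdb_map_url(emdb_id: int) -> str:
    """Build the map URL for an EMDB entry (pattern assumed from EMD-33292)."""
    base = "https://ftp.ebi.ac.uk/pub/databases/emdb/structures"
    return f"{base}/EMD-{emdb_id}/map/emd_{emdb_id}.map.gz"


def download(url: str, target: Path) -> Path:
    """Fetch url unless target already exists, similar to `wget -nc`."""
    if not target.exists():
        urllib.request.urlretrieve(url, target)
    return target


def gunzip_keep(archive: Path) -> Path:
    """Decompress next to the archive and keep the .gz, similar to `gunzip -k`."""
    target = archive.with_suffix("")  # emd_33292.map.gz -> emd_33292.map
    with gzip.open(archive, "rb") as src, target.open("wb") as dst:
        shutil.copyfileobj(src, dst)
    return target
```

With these helpers, `gunzip_keep(download(emdb_map_url(33292), Path("emd_33292.map.gz")))` would produce `emd_33292.map` in the working directory.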



## Prepare density using ChimeraX

Let us use [ChimeraX](https://www.rbvi.ucsf.edu/chimerax/) to prepare the EM density.


Write a ChimeraX command script (.cxc) to mask everything but the B chain in the density map from EMDB.

In [None]:
from pathlib import Path

in_density = Path("emd_33292.map")
pdb = Path("7xm9.cif")
unknown_chain = "B"
resolution = 3.48
masked_density = Path(f"{in_density.stem}-{pdb.stem}-{unknown_chain}-{resolution}.mrc")
script = masked_density.with_suffix(".cxc")
script.write_text(f"""\
open {in_density};
open {pdb};
delete #2/{unknown_chain};
molmap #2 {resolution};
volume mask #1 surfaces #3 invertMask true;
save {masked_density} #4;
exit
""")

148

In [2]:
# Because the sodium channel sits in a membrane, the EM density has a donut of low density around the protein.
# This makes fitting slow, because powerfit also has to try to fit the template into the membrane density.
# We can cheat by taking just the density of chain B with a bit of padding around it.
# This works when you want to try powerfit on an already fitted model, but not when the structure you want to fit is unknown.
from pathlib import Path

in_density = Path("emd_33292.map")
pdb = Path("7xm9.cif")
unknown_chain = "B"
resolution = 3.48
masked_density = Path(f"{in_density.stem}-{pdb.stem}-{unknown_chain}-{resolution}.cheated.mrc")

script = masked_density.with_suffix(".cxc")
script.write_text(f"""\
open {in_density};
open {pdb};
molmap #2/{unknown_chain} {resolution} balls true;
volume mask #1 surfaces #3 pad 4;
save {masked_density} #4;
exit
""")

146

In [3]:
!chimerax --nogui --script $script

available bundle cache has not been initialized yet
Executing: runscript emd_33292-7xm9-B-3.48.cheated.cxc
Executing: open emd_33292.map
Computing emd_33292.map surface, level 0.293
Calculated emd_33292.map surface, level 0.293, with 805008 triangles
Opened emd_33292.map as #1, grid size 256,256,256, pixel 1.04, shown at level 0.293, step 1, values float32
Executing: open 7xm9.cif
Fetching CCD 6OU, 0.0195 of 0.0195 Mbytes received
Fetching CCD G2E, 0.00888 of 0.00888 Mbytes received
Fetching CCD NAG, 0.0108 of 0.0108 Mbytes received
Summary of feedback from opening 7xm9.cif  
---  
_notes_ | Fetching CCD 6OU from https://files.wwpdb.org/pub/pdb/refdata/chem_comp/U/6OU/6OU.cif  
Fetching CCD G2E from
https://files.wwpdb.org/pub/pdb/refdata/chem_comp/E/G2E/G2E.cif  
Fetching CCD NAG from
https://files.wwpdb.org/pub/pdb/refdata/chem_comp/G/NAG/NAG.cif  
  

_7xm9.cif_ title: 

In [12]:
# Clean up after ourselves
script.unlink()

In [4]:
masked_density, pdb

(PosixPath('emd_33292-7xm9-B-3.48.cheated.mrc'), PosixPath('7xm9.cif'))

Open the `masked_density` file in a viewer to verify that the densities of chains A and C from the `pdb` file have been removed.
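Besides the visual check, a quick look at the file header can confirm that the masked map is much smaller than the original 256,256,256 grid. The following is a minimal sketch that reads a few fields of the MRC2014 header with the standard library only. The function name `read_mrc_header` is hypothetical, and the code assumes a little-endian file.

```python
import struct
from pathlib import Path


def read_mrc_header(path: Path) -> dict:
    """Read grid size, mode, cell and density statistics from an MRC header."""
    with path.open("rb") as handle:
        header = handle.read(1024)  # the main header has a fixed size
    if len(header) < 1024:
        raise ValueError(f"{path} is too short to be an MRC file")
    # Words 1-4: number of columns, rows, sections and the data mode.
    nx, ny, nz, mode = struct.unpack_from("<4i", header, 0)
    # Words 11-13 (byte offset 40): cell dimensions in angstrom.
    cell = struct.unpack_from("<3f", header, 40)
    # Words 20-22 (byte offset 76): minimum, maximum and mean density.
    dmin, dmax, dmean = struct.unpack_from("<3f", header, 76)
    return {
        "shape": (nx, ny, nz),
        "mode": mode,
        "cell": cell,
        "dmin": dmin,
        "dmax": dmax,
        "dmean": dmean,
    }
```

Calling `read_mrc_header(masked_density)` and `read_mrc_header(in_density)` side by side should show the difference in grid size between the two maps.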

To check that the created density can be used by powerfit we can run powerfit with it.

## Create a re-oriented template structure

We could just fit chain B from 7xm9.cif, but as it is already in the correct position and orientation, this feels a bit like cheating.
Instead, we can make a new template structure that has an incorrect position and orientation.

Let us use [atomium](https://atomium.bio/) to apply a rotation and translation to the B chain of 7xm9.cif.


In [None]:
from typing import cast

import atomium
import atomium.data

p = cast("atomium.data.File", atomium.open(str(pdb)))
m = cast("atomium.Model", p.model)
chains = cast("set[atomium.Chain]", m.chains())
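# Pick the single chain with the matching ID (a list comprehension alone would yield a list, not a chain)
chain = cast("atomium.Chain", next(c for c in chains if c.id == unknown_chain))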
chain.rotate(0.5, "x")
chain.rotate(0.1, "y")
chain.rotate(0.2, "z")
chain.translate(40, 50, 60)
template = pdb.with_suffix(".B-reoriented.pdb")
chain.save(str(template))
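To make clear what such a re-orientation does to each atom, here is a small standalone sketch of the same sequence of operations on a single point. It is not atomium's implementation: it assumes angles in radians, right-handed rotations about axes through the origin, and the helper names `rotate`, `translate` and `reorient` are made up for this illustration.

```python
import math

Point = tuple[float, float, float]


def rotate(point: Point, angle: float, axis: str) -> Point:
    """Rotate a point about a coordinate axis through the origin (radians)."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    if axis == "x":
        return (x, c * y - s * z, s * y + c * z)
    if axis == "y":
        return (c * x + s * z, y, -s * x + c * z)
    if axis == "z":
        return (c * x - s * y, s * x + c * y, z)
    raise ValueError(f"unknown axis: {axis}")


def translate(point: Point, dx: float, dy: float, dz: float) -> Point:
    """Shift a point by the given offsets."""
    x, y, z = point
    return (x + dx, y + dy, z + dz)


def reorient(point: Point) -> Point:
    """Apply the rotations and the translation used in the cell above."""
    for angle, axis in ((0.5, "x"), (0.1, "y"), (0.2, "z")):
        point = rotate(point, angle, axis)
    return translate(point, 40, 50, 60)
```

Because rotations and translations are rigid transformations, distances between atoms stay the same, so the template keeps its shape and only its pose changes.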

## Fit with powerfit

In [6]:
powerfit_result_dir = Path(f"powerfit-{masked_density.stem}")

In [9]:
!powerfit $masked_density $resolution $template -d $powerfit_result_dir --laplace --delimiter , -p 1

Target file read from:
/home/stefanv/git/protein-detective/protein-detective/docs/emd_33292-7xm9-B-3.48.cheated.mrc
Target resolution: 3.48
Initial shape of density: 99 71 61
Shape after trimming: 94 71 59
Shape after extending: 96 72 60
Template file read from:
/home/stefanv/git/protein-detective/protein-detective/docs/7xm9.B-reoriented.pdb
Reading in rotations.
Requested rotational sampling densi

In [10]:
powerfit_results = powerfit_result_dir / "solutions.out"

In [11]:
!head $powerfit_results

rank,cc,Fish-z,rel-z,x,y,z,a11,a12,a13,a21,a22,a23,a31,a32,a33
1,0.228,0.232,15.399,123.760,101.920,169.520,0.978,0.176,-0.109,-0.109,0.886,0.450,0.176,-0.429,0.886
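The solutions file can also be read back into Python for further processing. The sketch below parses the comma-separated file with the standard library, using the column names from the header line shown above. Treating `x,y,z` as a position and `a11` to `a33` as a row-major 3x3 rotation matrix is an assumption based on those column names, and `read_solutions` and `determinant` are hypothetical helper names.

```python
import csv
from pathlib import Path


def read_solutions(path: Path) -> list[dict]:
    """Parse a comma-delimited solutions file into a list of dictionaries."""
    solutions = []
    with path.open(newline="") as handle:
        for row in csv.DictReader(handle):
            rotation = [
                [float(row[f"a{i}{j}"]) for j in (1, 2, 3)] for i in (1, 2, 3)
            ]
            solutions.append(
                {
                    "rank": int(row["rank"]),
                    "cc": float(row["cc"]),
                    "position": (float(row["x"]), float(row["y"]), float(row["z"])),
                    "rotation": rotation,
                }
            )
    return solutions


def determinant(m: list[list[float]]) -> float:
    """Determinant of a 3x3 matrix; close to 1 for a proper rotation."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )
```

For example, `read_solutions(powerfit_results)[0]` would give the best-ranked solution, and checking that `determinant` of its rotation is close to 1 is a cheap sanity test that the matrix columns were read in the right order.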


## Visualize fit

![Mol* screenshot of best fit and masked density, optimized using https://squoosh.app/](./em-prepare.jpg)

In [None]:
# To visualize interactively run this cell
from protein_detective.visualization import show_structure_and_density

best_fit = powerfit_result_dir / "fit_1.pdb"
show_structure_and_density(best_fit, masked_density)