-
Notifications
You must be signed in to change notification settings - Fork 14
Processing rac COTE MicroED data
- Available in Zenodo.
- Collected on Talos Arctica G2 (200 kV)
- Ceta (indirect CMOS) detector
- Images collected using Velox software (.emd format)
- Stage tilt controlled using SerialEM
Here I use DIALS-2.1.0. Please download and install DIALS first.
To process emd files using DIALS, you need to install the dxtbx class. Please download FormatEMD.py and execute
dxtbx.install_format -g FormatEMD.pySpecify -u instead of -g if you do not have permission to DIALS installation directory.
See also here for the tips for electron diffraction data processing.
It is recommended to make a beam stop mask. You can draw polygon on the dials.image_viewer window. Here is the parameters I used.
dials.import microed_cotamd_200413/Camera_Nano_Ceta_0009_850_mm.emd
dials.generate_mask imported.expt use_trusted_range=false untrusted.polygon="0 964 950 962 986 934 1126 932 1174 972 1314 1006 1170 1056 1140 1084 1040 1098 976 1086 946 1068 938 1064 0 1070 0 964 0 964"We need to check the useful frame range for each dataset because stage tilt and detector were not synchronized. Save the following script as check_valid_ranges.py
import h5py
import numpy
import json
def get_metadata(metadata):
mds = []
for i in xrange(metadata.shape[1]):
metadata_array = metadata[:, i].T
mdata_string = metadata_array.tostring().decode("utf-8")
mds.append(json.loads(mdata_string.rstrip('\x00')))
return mds
# get_metadata()
def analyse_angle(metadata):
alphas = []
for i, md in enumerate(metadata):
alpha = numpy.rad2deg(float(md["Stage"]["AlphaTilt"]))
alphas.append(alpha)
d_alphas = numpy.diff(alphas)
if len(d_alphas) == 0: return [-1,-1], None, None
q25, q50, q75 = numpy.percentile(d_alphas, [25, 50, 75])
iqr = q75-q25
iqrc = 1.5
lowlim, highlim = q25 - iqrc*iqr, q75 + iqrc*iqr
d_alphas2 = d_alphas[numpy.where(numpy.logical_and(d_alphas>lowlim, d_alphas<highlim))] # outlier rejected
d_alpha_z = abs(d_alphas-numpy.mean(d_alphas2))/numpy.std(d_alphas2)
valid_range = [0, len(metadata)-1]
for i in xrange(len(metadata)-1):
if d_alpha_z[i] < 3: break
valid_range[0] = i+1
for i in reversed(xrange(len(metadata)-1)):
if d_alpha_z[i] < 3: break
valid_range[1] = i
if valid_range[0] > valid_range[1]:
valid_range = [0, len(metadata)-1] # reset
mean_alpha_step = (alphas[valid_range[1]] - alphas[valid_range[0]])/(valid_range[1]-valid_range[0])
return valid_range, mean_alpha_step, alphas[valid_range[0]]
# analyse_angle()
if __name__ == "__main__":
import sys
for f in sys.argv[1:]:
try:
h = h5py.File(f, "r")
image_path = h["/Data/Image"]
k = image_path.keys()[0]
metadata = get_metadata(h["/Data/Image/%s/Metadata" % k])
valid_range, mean_alpha_step, alpha_start = analyse_angle(metadata)
print f, "%s % 7.4f % 7.4f" %(",".join(map(lambda x:str(x+1), valid_range)), alpha_start, mean_alpha_step)
except:
print f, "0,0 nan nan"and run
python ./check_valid_ranges.py `find microed_cotamd_200413 -name \*.emd` > valid_ranges.txtHere is the script to process all datasets (assuming you have *.emd files in the same directory):
root=microed_cotamd_200413
mask=pixels.mask
for n in 0009 0012 0016 0019 0022 0025 0028 0031 0035 0039 0042 0046 0049 0052 0056 0059 0102 0106 0110 0113 0116 0119 0123 0126 0130 0133 0137 0140 0144 0147 0151 0154 0158 0201 0204 0208 0211 0215 0218 0222 0225 0228 0231 0
235 0238 0242 0246 0249 0252 0256 0300 0305 0307 0310 0314 0317 0321 0324 0327 0331 0335 0338 0341 0345 0349 0353 0356 0400 0403 0407 0412 0414 0418 0421 0425 0428 0432 0435 0439 0442 0446 0449 0453 0457 0500 0504 0507 0511 0
514 0518 0522 0526 0529 0533 0537 0541 0544 0548 0552 0555 0600 0602 0606 0609 0613 0616 0619 0623 0627 0629 0632 0635 0638 0641 0644 0649 0651 0654 0657 0700 0703 0705 0708 0711 0714 0717 0720 0724 0727 0730 0733 0735 0738 0
742 0745 0748 0751 0754 0805 0808 0811 0814 0818 0821 0824 0827 0831 0834 0838 0841 0844 0848 0851 0856 0858 0901 0905 0909 0912 0916 0919 0923 0926 0930 0933 1125 1129 1137 1140 1143 1147 1150 1153 1156 1200 1203 1206 1209 1
213 1216 1331 1340 1344 1347 1351 1354 1357 1401 1404 1408 1411 1415 1419 1422 1426 1429 1432 1436 1439 1442 1445 1448 1452 1455 1458 1502 1505 1508 1512 1515 1519 1522 1526 1530 1534 1537 1541 1545 1548 1551 1554 1557 1600 1
612 1616 1619 1623 1626 1630 1633 1636 1639 1643 1647 1651 1654 1657 1700 1704 1708 1711 1715 1718 1722 1726 1729 1733 1737 1740 1744 1748 1751 1754 1757 1800 1804 1807 1810 1813 1816 1820 1823 1826 1830 1834 1838 1842 1845 1
849 1853 1857 1900 1904 1908 1911 1914 1917 1920 1924 1927 1930 1934 1937 1941 1944 1947 1951 1955 1958 2002 2005 2009 2012 2016 2020 2023 2027 2030 2034 2037 2040 2043 2046 2049 2053 2057 2101 2104 2108 2112 2115 2118 2121 2
125 2129 2132 2137 2140 2143 2146 2150 2153 2156 2200 2204 2208 2211 2215 2218 2222 2225 2229 2232 2236 2240 2243 2247 2250 2254 2257 2301 2304 2308 2311 2315 2318 2322 2326 2329 2333 2336 2340 2344 2347 2351 2355 2358
do
f=Camera_Nano_Ceta_${n}_850_mm.emd
vr=`awk '/'$f'/{print $2}' ../valid_ranges.txt`
test "$vr" == "0,0" && echo "skipping $f" && continue
test -e dials_$n/imported.expt && continue
mkdir -pv dials_$n
qsub -S /bin/bash -V -wd $PWD/dials_$n -j y -pe smp 4 <<+
dials.import $root/Camera_Nano_Ceta_${n}_850_mm.emd geometry.detector.panel.origin=-29.43864,28.53116,-819.71 image_range=$vr mask=$mask goniometer.axes=1,0,0 slow_fast_beam_centre=1024,1055
DXTBX_EMD_OFFSET=200 dials.find_spots nproc=\$NSLOTS threshold.dispersion.global_threshold=200 d_max=8 imported.expt
#dials.search_beam_position imported.expt strong.refl nproc=\$NSLOTS
test -e optimised.expt && expt=optimised.expt || expt=imported.expt
orgd=\$PWD
for meth in fft1d fft3d
do
for ameth in simple local
do
cd \$orgd
suf=\${meth}_\${ameth}
dials.index \$expt strong.refl detector.fix=distance index_assignment.method=\$ameth indexing.method=\$meth max_lattices=1 \
output.experiments=indexed_\${suf}.expt output.reflections=indexed_\${suf}.refl output.log=dials.index_\${suf}.log || continue
nlatt=\`grep -c "__id__.*crystal" indexed_\${suf}.expt\`
for i in \`seq 1 \$nlatt\`
do
mkdir refine_bravais_settings_\${suf}
dials.refine_bravais_settings indexed_\${suf}.expt indexed_\${suf}.refl crystal_id=\$i detector.fix=distance output.prefix=\${i}_ output.directory=refine_bravais_settings_\${suf} log=refine_bravais_settings_\${suf}/\${i}_d
ials.refine_bravais_settings.log
done
mkdir \$suf
cd \$suf
dials.refine ../indexed_\${suf}.expt ../indexed_\${suf}.refl detector.fix=distance scan_varying=true
test -e refined.expt && prefix=refined || prefix=../indexed_\${suf}
DXTBX_EMD_OFFSET=200 dials.integrate \${prefix}.expt \${prefix}.refl nproc=\$NSLOTS profile.gaussian_rs.min_spots.per_degree=0 profile.gaussian_rs.min_spots.overall=0 || continue
mkdir exported_0
cd exported_0
dials.export ../integrated.expt ../integrated.refl format=xds_ascii
echo setting c2 | pointless DIALS.HKL > pointless.log
cd ..
for i in \`seq 0 \$((nlatt-1))\`
do
test -e exported_0/DIALS_\${i}.HKL || continue
mkdir exported_\$i
ln -s ../exported_0/DIALS_\${i}.HKL exported_\$i/DIALS.HKL
echo setting c2 | pointless exported_\$i/DIALS.HKL > exported_\$i/pointless.log
done
done
done
+
doneI decided to use results from fft1d_simple. Let us prepare files for merging.
kamo.multi_prep_merging workdir=kamo_1_fft1d_simple xdsdir=dials_\*/fft1d_simple/exported_0/It should say:
Making groups from 338 results
[ 1] 324 members:
Averaged P1 Cell= 11.95 12.27 14.42 110.82 100.67 94.80
Possible symmetries:
freq symmetry a b c alpha beta gamma reindex
324 P 1 11.95 12.27 14.42 110.82 100.67 94.80 a,b,c
0 C 1 2 1 27.02 12.27 11.95 94.80 103.62 85.70 -b-2*c,-b,-a
[ 2] 8 members:
Averaged P1 Cell= 9.13 14.91 16.03 114.24 81.21 100.85
Possible symmetries:
freq symmetry a b c alpha beta gamma reindex
8 P 1 9.13 14.91 16.03 65.76 81.21 79.15 -a,b,-c
0 C 1 2 1 29.51 9.13 16.83 91.22 119.61 96.84 -a-2*b,a,b+c
0 C 1 2 1 18.97 25.75 9.13 81.35 117.54 89.71 -a-b-c,a+b-c,a
0 C 1 2 1 26.00 16.83 9.13 88.78 101.67 85.47 -b+c,b+c,-a
0 I 2 2 2 9.13 16.83 25.75 95.01 81.35 91.22 -a,-b-c,-a-b+c
...
So I chose group 1 and P1 symmetry.
Then go to kamo_1_fft1d_simple/ directory and edit merge_blend.sh. I edited like this:
#!/bin/sh
# settings
dmin=0.8 # resolution
anomalous=false # true or false
lstin=formerge.lst # list of XDS_ASCII.HKL files
use_ramdisk=true # set false if there is few memory or few space in /tmp
# _______/setting
kamo.multi_merge \
workdir=blend_${dmin}A_framecc_b+B \
lstin=${lstin} d_min=${dmin} anomalous=${anomalous} \
space_group=None reference.data=None \
program=xscale xscale.reference=bmin xscale.degrees_per_batch=1 \
reject_method=framecc+lpstats rejection.lpstats.stats=em.b+bfactor \
clustering=blend blend.min_cmpl=80 blend.min_redun=8 blend.max_LCV=None blend.max_aLCV=None \
max_clusters=None xscale.use_tmpdir_if_available=${use_ramdisk} \
batch.engine=sge batch.par_run=merging batch.nproc_each=8 nproc=8 batch.sge_pe_name=smpI took the second largest cluster based on the CC1/2. Here is the snippet of cluster_summary.dat:
cluster ClH LCV aLCV run ds.all ds.used Cmpl Redun I/sigI Rmeas CC1/2 Cmpl.ou Red.ou I/sig.ou Rmeas.ou CC1/2.ou Cmpl.in Red.in I/sig.in Rmeas.in CC1/2.in SigAno.in CCano.in WilsonB Aniso.bst Aniso.wst dmin.est
cluster_0323 101.72 4.7 0.7 1 324 324 95.9 205.2 29.89 55.3 99.5 90.7 204.1 0.00 -99.9 11.6 97.2 187.8 175.64 7.3 100.0 1.2 -23.0 3.27 0.79 1.65 0.87
cluster_0323 101.72 4.7 0.7 2 324 324 94.0 183.2 30.32 34.0 99.5 89.1 180.9 3.15 847.1 9.6 95.1 171.5 175.33 6.6 99.9 1.1 -21.0 3.20 0.79 2.15 0.84
cluster_0323 101.72 4.4 0.7 3 324 279 96.0 166.6 27.02 57.7 98.9 96.4 166.2 1.67 882.0 40.5 93.7 154.6 174.55 6.3 100.0 1.1 -23.0 1.21 0.63 2.14 0.77
cluster_0322 61.64 3.3 0.5 1 236 236 96.5 149.4 24.51 59.4 99.5 92.2 149.0 0.00 -99.9 22.4 97.2 137.9 143.68 7.5 99.9 1.1 -19.0 3.23 0.84 7.12 0.87
cluster_0322 61.64 3.3 0.5 2 236 236 97.6 133.0 23.72 36.2 99.5 97.5 132.3 1.73 1095.2 23.3 95.5 125.7 144.13 6.8 99.9 1.1 -22.0 3.15 0.75 2.07 0.82
cluster_0322 61.64 3.3 0.5 3 236 202 96.5 119.4 22.08 53.3 99.0 96.8 119.1 1.35 1061.2 52.5 94.4 111.7 141.02 6.5 100.0 1.1 -20.0 1.60 0.61 4.67 0.78
cluster_0321 49.56 3.1 0.5 1 162 162 98.1 100.7 19.01 64.1 98.3 96.4 100.6 0.00 -99.9 3.4 97.2 93.0 115.72 7.6 99.9 1.1 -22.0 3.25 0.82 1.77 0.87
cluster_0321 49.56 3.1 0.5 2 162 162 97.0 89.7 18.96 37.8 99.5 95.6 89.3 1.25 1527.7 3.1 93.7 86.3 119.97 6.9 99.9 1.0 -25.0 3.29 0.80 1.74 0.85
cluster_0321 49.56 2.8 0.5 3 162 140 95.3 82.3 17.86 58.4 98.5 95.5 82.2 0.96 1598.7 17.7 93.0 77.4 119.27 6.6 99.9 1.1 -23.0 1.63 0.65 1.71 0.80
...
Here is the statistics from blend_0.8A_framecc_b+B/cluster_0322/run_03/.
UNIT_CELL_CONSTANTS= 11.95 12.27 14.41 110.734 100.784 94.739
SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
RESOLUTION NUMBER OF REFLECTIONS COMPLETENESS R-FACTOR R-FACTOR COMPARED I/SIGMA R-meas CC(1/2) Anomal SigAno Nano
LIMIT OBSERVED UNIQUE POSSIBLE OF DATA observed expected Corr
2.40 30150 270 286 94.4% 6.4% 5.2% 30147 141.02 6.5% 100.0* -20 1.123 256
1.70 60819 514 536 95.9% 12.2% 9.7% 60811 82.10 12.3% 99.8* -28 0.910 491
1.39 79594 659 681 96.8% 21.6% 20.5% 79588 44.19 21.7% 99.6* -11 0.849 635
1.20 93670 785 818 96.0% 34.6% 38.7% 93663 27.14 34.7% 97.6* 5 0.801 755
1.07 113132 949 982 96.6% 44.0% 55.8% 113118 19.66 44.2% 97.0* 8 0.754 909
0.98 114970 961 994 96.7% 95.4% 143.6% 114955 8.55 95.8% 90.1* 8 0.617 917
0.91 123713 1022 1057 96.7% 183.4% 296.3% 123704 4.66 184.3% 68.7* 7 0.551 984
0.85 139906 1167 1204 96.9% 315.5% 528.2% 139884 2.89 316.9% 49.0* 6 0.485 1111
0.80 149358 1254 1295 96.8% 1056.3% 1792.5% 149338 1.35 1061.2% 52.5* 10 0.430 1203
total 905312 7581 7853 96.5% 53.1% 77.2% 905208 22.08 53.3% 99.0* 1 0.651 7261
Anisotropy:
direction B_eigen Resol(CC1/2=0.5)
0.91a*-0.39b*-0.16c* 0.99 0.61
0.30a*+0.86b*-0.41c* 0.50 0.77
0.06a*-0.07b*+c* 4.10 4.67
Note that actual space group is P-1.