Processing AtDTX14 data (5Y50)
The following describes how A. thaliana DTX14 (AtDTX14) datasets can be processed using KAMO (documentation in Japanese / English).
- Original paper
- Miyauchi et al. (2017) "Structural basis for xenobiotic extrusion by eukaryotic MATE transporter." Nature Communications doi: 10.1038/s41467-017-01541-0 PDB: 5Y50
- Available in Zenodo.
- Collected on BL32XU, SPring-8
- MX225HS CCD detector (2x2 binning), 10×10 or 18×10 μm² beam, 1 Å wavelength
- 5, 10, or 20°/dataset, 1°/frame (shutterless)
- 288 datasets collected automatically (ZOO system) and 85 manually, from 10 and 22 cryoloops, respectively
- P212121; a=52.8, b=86.8, c=116.4 Å
The GUI command 'kamo' was run with exclude_resolution_range=4.55,4.4 to exclude a strong diffraction ring from lipid. XDS (ver. May 1, 2016 BUILT=20160617) was used for integration, and no prior crystal information was supplied.
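The effect of the excluded range can be pictured with a minimal sketch. It only illustrates the idea of dropping reflections whose d-spacing falls inside the 4.55 to 4.4 Å shell; the function name `in_excluded_range`, the sample d-spacings, and the inclusive boundary handling are assumptions for illustration, not KAMO or XDS internals.

```python
# Illustrative only: not KAMO/XDS code. Boundaries are treated as
# inclusive here, which is an assumption.

def in_excluded_range(d, ranges=((4.55, 4.4),)):
    """Return True if d-spacing d (in Å) lies inside any excluded shell."""
    return any(min(a, b) <= d <= max(a, b) for a, b in ranges)

# Hypothetical d-spacings (Å) of a few reflections
d_spacings = [7.78, 4.60, 4.50, 4.45, 4.39, 2.60]

# Keep only reflections outside the lipid ring
kept = [d for d in d_spacings if not in_excluded_range(d)]
print(kept)
```

Reflections at 4.50 and 4.45 Å fall inside the ring and are dropped; everything on either side of the shell is retained.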
148 out of 373 datasets were indexed and integrated, and 139 datasets belonged to the largest group of consistent unit cells:
[ 1] 139 members:
Averaged P1 Cell= 52.79 86.72 116.43 90.32 90.33 90.30
Possible symmetries:
   freq symmetry        a      b      c  alpha   beta  gamma reindex
     23 P 1         52.79  86.72 116.43  90.32  90.33  90.30 a,b,c
      6 P 1 2 1     52.79  86.72 116.43  90.32  90.33  90.30 a,b,c
      8 P 1 2 1     86.72  52.79 116.43  89.67  90.32  89.70 b,-a,c
     12 P 1 2 1     52.79 116.43  86.72  89.68  90.30  89.67 a,-c,b
     90 P 2 2 2     52.79  86.72 116.43  90.32  90.33  90.30 a,b,c
Because P222 was by far the most frequent symmetry (90 of the 139 datasets), P222 was assumed and the XDS_ASCII files were re-indexed to P222 symmetry.
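The choice of symmetry can be reproduced from the frequency table above with a short sketch. The parsing code and the names `parse_rows` and `totals` are illustrative assumptions; KAMO makes this tally internally and its actual implementation may differ.

```python
from collections import Counter

# Rows copied from the "Possible symmetries" table above
TABLE = """\
23 P 1 52.79 86.72 116.43 90.32 90.33 90.30 a,b,c
6 P 1 2 1 52.79 86.72 116.43 90.32 90.33 90.30 a,b,c
8 P 1 2 1 86.72 52.79 116.43 89.67 90.32 89.70 b,-a,c
12 P 1 2 1 52.79 116.43 86.72 89.68 90.30 89.67 a,-c,b
90 P 2 2 2 52.79 86.72 116.43 90.32 90.33 90.30 a,b,c
"""

def parse_rows(text):
    """Split each row into (freq, symmetry, cell, reindex operator)."""
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        freq = int(tokens[0])
        reindex = tokens[-1]
        cell = tuple(float(t) for t in tokens[-7:-1])  # a b c alpha beta gamma
        symmetry = " ".join(tokens[1:-7])              # e.g. "P 1 2 1"
        rows.append((freq, symmetry, cell, reindex))
    return rows

# Sum frequencies per symmetry, ignoring the axis setting
totals = Counter()
for freq, symmetry, _cell, _reindex in parse_rows(TABLE):
    totals[symmetry] += freq

# P1 is always possible, so pick the most frequent higher symmetry
best = max((s for s in totals if s != "P 1"), key=totals.get)
n_datasets = sum(totals.values())
print(best, totals[best], "of", n_datasets)
```

The three P 1 2 1 rows differ only in which axis is unique, so they are pooled here; even pooled, they account for far fewer datasets than P 2 2 2.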
To remove outliers having extremely different unit cell parameters, filter_cell.R was used and 8 datasets were removed.
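The criterion used by filter_cell.R is not described here. As a rough illustration of what a unit-cell outlier filter can look like, the sketch below rejects any dataset whose a, b, or c lies more than 1.5 interquartile ranges outside the quartiles. The IQR rule, the factor 1.5, the function names, and the cell values are all assumptions made up for this example; they are not taken from filter_cell.R or from the AtDTX14 data.

```python
import statistics

def iqr_bounds(values, k=1.5):
    """Lower and upper fences: quartiles widened by k times the IQR."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def filter_cells(cells, k=1.5):
    """Keep cells whose a, b, and c all lie within their fences."""
    bounds = [iqr_bounds([cell[i] for cell in cells], k) for i in range(3)]
    return [
        cell for cell in cells
        if all(lo <= cell[i] <= hi for i, (lo, hi) in enumerate(bounds))
    ]

# Hypothetical (a, b, c) values in Å; the last entry is a deliberate outlier
cells = [
    (52.7, 86.7, 116.4),
    (52.8, 86.8, 116.5),
    (52.8, 86.7, 116.4),
    (52.9, 86.8, 116.3),
    (52.8, 86.6, 116.4),
    (52.8, 86.7, 116.5),
    (52.9, 86.8, 116.4),
    (55.9, 86.7, 116.5),
]

kept = filter_cells(cells)
print(len(cells) - len(kept), "dataset(s) removed")
```

Quartile-based fences are used in this sketch because, unlike a mean and standard deviation, they are barely shifted by the outliers they are meant to detect.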
Next, several clustering procedures were tested. The best result (the one with the largest CC1/2) was a subcluster, the second largest cluster, from the CC-based clustering. It consisted of 100 datasets and was written to ccc_2.6A_framecc_b+B_goodcell/cluster_0129/run_03/XSCALE.LP:
 SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
 RESOLUTION     NUMBER OF REFLECTIONS    COMPLETENESS R-FACTOR  R-FACTOR COMPARED I/SIGMA   R-meas  CC(1/2)  Anomal  SigAno   Nano
   LIMIT     OBSERVED  UNIQUE  POSSIBLE     OF DATA   observed  expected                                      Corr
     7.78       18239     728       734       99.2%      13.0%     13.1%    18239   28.45     13.3%    99.7*    -4    0.838     464
     5.51       33853    1217      1217      100.0%      20.9%     19.3%    33853   19.22     21.3%    99.6*     2    0.885     946
     4.50       40157    1457      1537       94.8%      24.4%     20.6%    40152   18.11     24.9%    99.4*    -2    0.975    1182
     3.90       44566    1601      1776       90.1%      34.5%     30.3%    44558   13.83     35.1%    98.4*     1    0.928    1333
     3.49       57924    1996      1996      100.0%      53.9%     54.8%    57924    8.59     54.9%    97.4*    -1    0.802    1729
     3.18       66004    2257      2257      100.0%     105.1%    126.7%    66004    4.23    106.9%    93.4*    -1    0.683    1983
     2.95       68068    2329      2329      100.0%     182.8%    238.4%    68068    2.48    186.0%    87.4*    -1    0.632    2065
     2.76       73256    2528      2528      100.0%     309.6%    411.1%    73256    1.57    315.1%    80.8*     0    0.580    2260
     2.60       78809    2767      2767      100.0%     671.2%    910.8%    78809    0.90    683.3%    60.9*    -2    0.536    2494
    total      480876   16880     17141       98.5%      57.6%     66.4%   480863    7.79     58.7%    99.4*    -1    0.713   14456
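A few derived numbers help when reading this table. The sketch below parses the rows shown above and computes the multiplicity of each shell as observed divided by unique reflections; the helper `parse_shell` and the dictionary keys are names chosen for this example, not part of XSCALE or KAMO.

```python
# Rows copied from the XSCALE.LP table above
XSCALE_ROWS = """\
7.78 18239 728 734 99.2% 13.0% 13.1% 18239 28.45 13.3% 99.7* -4 0.838 464
5.51 33853 1217 1217 100.0% 20.9% 19.3% 33853 19.22 21.3% 99.6* 2 0.885 946
4.50 40157 1457 1537 94.8% 24.4% 20.6% 40152 18.11 24.9% 99.4* -2 0.975 1182
3.90 44566 1601 1776 90.1% 34.5% 30.3% 44558 13.83 35.1% 98.4* 1 0.928 1333
3.49 57924 1996 1996 100.0% 53.9% 54.8% 57924 8.59 54.9% 97.4* -1 0.802 1729
3.18 66004 2257 2257 100.0% 105.1% 126.7% 66004 4.23 106.9% 93.4* -1 0.683 1983
2.95 68068 2329 2329 100.0% 182.8% 238.4% 68068 2.48 186.0% 87.4* -1 0.632 2065
2.76 73256 2528 2528 100.0% 309.6% 411.1% 73256 1.57 315.1% 80.8* 0 0.580 2260
2.60 78809 2767 2767 100.0% 671.2% 910.8% 78809 0.90 683.3% 60.9* -2 0.536 2494
total 480876 16880 17141 98.5% 57.6% 66.4% 480863 7.79 58.7% 99.4* -1 0.713 14456
"""

def parse_shell(line):
    """Extract the columns used below from one table row."""
    t = line.split()
    return {
        "shell": t[0],                           # resolution limit or "total"
        "observed": int(t[1]),
        "unique": int(t[2]),
        "i_over_sigma": float(t[8]),
        "cc_half": float(t[10].rstrip("*")),     # "*" marks significance
    }

shells = [parse_shell(line) for line in XSCALE_ROWS.splitlines() if line.strip()]

# Multiplicity = observed reflections per unique reflection
for s in shells:
    s["multiplicity"] = s["observed"] / s["unique"]
    print(f'{s["shell"]:>6} multiplicity={s["multiplicity"]:6.2f} '
          f'I/sigma={s["i_over_sigma"]:6.2f} CC1/2={s["cc_half"]:5.1f}')

overall = shells[-1]   # the "total" row
outer = shells[-2]     # the 2.60 Å shell
```

With 100 datasets merged, each unique reflection is measured roughly 28 times on average, which is worth keeping in mind when comparing the large outer-shell R-meas values with the still significant CC1/2.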
This result was produced by the following script:
#!/bin/sh
# settings
dmin=2.6 # resolution
clustering_dmin=3.0 # resolution for CC calculation
anomalous=false # true or false
lstin=formerge_goodcell.lst # list of XDS_ASCII.HKL files
use_ramdisk=true # set to false if memory or free space in /tmp is limited
# _______/setting
kamo.multi_merge \
workdir=ccc_${dmin}A_framecc_b+B_goodcell resolution.estimate=true \
lstin=${lstin} d_min=${dmin} anomalous=${anomalous} \
program=xscale xscale.reference=bmin \
reject_method=framecc+lpstats rejection.lpstats.stats=em.b+bfactor \
clustering=cc cc_clustering.d_min=${clustering_dmin} cc_clustering.b_scale=false cc_clustering.use_normalized=false \
cc_clustering.min_cmpl=90 cc_clustering.min_redun=2 \
xscale.use_tmpdir_if_available=${use_ramdisk} \
batch.engine=sge batch.par_run=merging batch.nproc_each=8 nproc=8 batch.sge_pe_name=par