## Metareader – experimental resolution for X-ray and Cryo-EM structures

The goal of this notebook is to show how the `metareader` script can be used to:

- identify the experimental method used to solve a structure,
- check the experimental resolution from mmCIF metadata.

### Two example structures are used:

- **1UBQ** – structure solved by X-ray diffraction,
- **7K00** – structure solved by Cryo-EM.

Input files are local mmCIF files stored in: `data/pdb/`


## Example 1: X-ray structure (1UBQ)


### Listing available metadata categories

Before extracting any information, we check which metadata categories are present
in an mmCIF file.


In [1]:
!python ../src/rnapolis/metareader.py \
  ../data/pdb/1ubq.cif \
  --list-categories | head -n 30

Traceback (most recent call last):
  File "/tmp/tmp.DSKFnZvuxa/rnapolis-py/notebooks/../src/rnapolis/metareader.py", line 111, in <module>
    main()
    ~~~~^^
  File "/tmp/tmp.DSKFnZvuxa/rnapolis-py/notebooks/../src/rnapolis/metareader.py", line 94, in main
    file = handle_input_file(args.path)
  File "/tmp/tmp.DSKFnZvuxa/rnapolis-py/.venv/lib/python3.13/site-packages/rnapolis/util.py", line 18, in handle_input_file
    with open(path) as f:
         ~~~~^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '../data/pdb/1ubq.cif'


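The run captured above failed with `FileNotFoundError` because `../data/pdb/1ubq.cif`
did not exist at that path. Once the file is in place, the command prints one category
name per line. For an X-ray entry such as 1UBQ, the list includes categories such as:

- `exptl`
- `diffrn`
- `refine`

and no EM-related categories. Note that `exptl` also appears in Cryo-EM entries, so
`diffrn` and `refine` are the informative ones. The experimental resolution should
therefore be read from the `refine` category.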
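The category check described above can be sketched in Python. This is only an illustration: `guess_technique` and `XRAY_MARKERS` are hypothetical names, not part of `metareader`, and the rule that any `em_` category takes precedence over the X-ray markers is an assumption.

```python
# Hypothetical helper: infer the likely technique from the category names
# printed by `--list-categories` (one name per line).
XRAY_MARKERS = {"diffrn", "refine"}


def guess_technique(category_names):
    names = {name.strip() for name in category_names if name.strip()}
    # Assumption: any category starting with "em_" marks an EM entry,
    # even if X-ray style categories are present as well.
    if any(name.startswith("em_") for name in names):
        return "EM"
    if names & XRAY_MARKERS:
        return "X-ray"
    return "unknown"


print(guess_technique(["entry", "exptl", "diffrn", "refine"]))
print(guess_technique(["entry", "exptl", "em_3d_reconstruction"]))
```

A guess based on category names is only a quick screen; the `method` field in `exptl` remains the authoritative source.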


### Checking the experimental method

The experimental method is stored in the `exptl` category.


In [17]:
!python ../src/rnapolis/metareader.py \
  ../data/pdb/1ubq.cif \
  -c exptl \
  > ../outputs/1ubq_exptl.json

In [18]:
!head ../outputs/1ubq_exptl.json

{"struct":[{"entry_id":"1UBQ","title":"STRUCTURE OF UBIQUITIN REFINED AT 1.8 ANGSTROMS RESOLUTION","pdbx_model_details":"?","pdbx_CASP_flag":"?","pdbx_model_type_details":"?"}],"exptl":[{"entry_id":"1UBQ","method":"X-RAY DIFFRACTION","crystals_number":"?"}]}


In the output, the experimental method is stored in the `exptl` category.
We look for the field `method`:

"exptl":[{"entry_id":"1UBQ",**"method"**:"X-RAY DIFFRACTION"

This confirms that the structure 1UBQ was solved using **X-ray diffraction**.
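Instead of reading the JSON by eye, the field can be pulled out programmatically. The sketch below uses a trimmed copy of the 1UBQ output shown above; `read_methods` is a hypothetical helper, and returning a list is an assumption made to cope with entries that have more than one `exptl` row.

```python
import json

# Trimmed copy of the 1UBQ `-c exptl` output shown above.
raw = (
    '{"exptl":[{"entry_id":"1UBQ","method":"X-RAY DIFFRACTION",'
    '"crystals_number":"?"}]}'
)


def read_methods(metadata):
    """Return every `method` value found in the `exptl` category."""
    return [row["method"] for row in metadata.get("exptl", []) if "method" in row]


methods = read_methods(json.loads(raw))
print(methods)
```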


### Checking the experimental resolution

For X-ray structures the experimental resolution is stored in the `refine` category,
most commonly in the field `ls_d_res_high`.


In [23]:
!python ../src/rnapolis/metareader.py \
  ../data/pdb/1ubq.cif \
  -c refine \
  > ../outputs/1ubq_refine.json

In [32]:
!head -c 500 ../outputs/1ubq_refine.json

{"struct":[{"entry_id":"1UBQ","title":"STRUCTURE OF UBIQUITIN REFINED AT 1.8 ANGSTROMS RESOLUTION","pdbx_model_details":"?","pdbx_CASP_flag":"?","pdbx_model_type_details":"?"}],"refine":[{"entry_id":"1UBQ","ls_number_reflns_obs":"?","ls_number_reflns_all":"?","pdbx_ls_sigma_I":"?","pdbx_ls_sigma_F":"?","pdbx_data_cutoff_high_absF":"?","pdbx_data_cutoff_low_absF":"?","pdbx_data_cutoff_high_rms_absF":"?","ls_d_res_low":"?","ls_d_res_high":"1.8","ls_percent_reflns_obs":"?","ls_R_factor_obs":"0.1760

In the output, the experimental resolution is stored in the `refine` category.
We look for the field `ls_d_res_high`:


"refine": [
{
"entry_id": "1UBQ",
"ls_number_reflns_obs": "?",
"ls_number_reflns_all": "?",
"pdbx_ls_sigma_I": "?",
"pdbx_ls_sigma_F": "?",
"pdbx_data_cutoff_high_absF": "?",
"pdbx_data_cutoff_low_absF": "?",
"pdbx_data_cutoff_high_rms_absF": "?",
"ls_d_res_low": "?",
"ls_d_res_high": "1.8"
}
]

The experimental resolution for the X-ray structure is given by the field
`ls_d_res_high`, which in this case equals **1.8 Å**.
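Most fields in the output above hold `?`, the mmCIF placeholder for an unknown value (`.` marks an inapplicable one), so a numeric field needs a guarded conversion. A minimal sketch, with `parse_cif_number` as a hypothetical helper name and `refine_row` as a shortened copy of the record above:

```python
def parse_cif_number(value):
    """Convert an mmCIF string to float; '?' and '.' mean there is no value."""
    if value is None or value in ("?", "."):
        return None
    return float(value)


# Shortened copy of the 1UBQ `refine` record shown above.
refine_row = {"entry_id": "1UBQ", "ls_d_res_low": "?", "ls_d_res_high": "1.8"}

resolution = parse_cif_number(refine_row["ls_d_res_high"])
low_limit = parse_cif_number(refine_row["ls_d_res_low"])
print(resolution, low_limit)
```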


### For 1UBQ:

- method: read from `exptl -> method`,
- resolution: read from `refine -> ls_d_res_high`.


## Example 2: Cryo-EM structure (7K00)


### Listing available categories


In [43]:
!python ../src/rnapolis/metareader.py \
  ../data/pdb/7k00.cif \
  --list-categories | head -n 30

entry
audit_conform
database_2
pdbx_audit_revision_history
pdbx_audit_revision_details
pdbx_audit_revision_group
pdbx_audit_revision_category
pdbx_audit_revision_item
pdbx_database_status
pdbx_database_related
audit_author
citation
citation_author
entity
entity_name_com
entity_poly
pdbx_entity_nonpoly
entity_poly_seq
entity_src_nat
pdbx_entity_src_syn
chem_comp
pdbx_poly_seq_scheme
pdbx_nonpoly_scheme
pdbx_unobs_or_zero_occ_atoms
cell
symmetry
exptl
struct
struct_keywords
struct_asym


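The set of categories in Cryo-EM entries differs from that in X-ray entries.
The `refine` and `diffrn` categories are usually absent, while categories related
to electron microscopy and 3D reconstruction (such as `em_3d_reconstruction`, used
below) are present. They do not appear in the listing above because `head -n 30`
cuts the output off after the first 30 categories.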


### Checking the experimental method


In [45]:
!python ../src/rnapolis/metareader.py \
  ../data/pdb/7k00.cif \
  -c exptl \
  > ../outputs/7k00_exptl.json

In [47]:
!head -c 500 ../outputs/7k00_exptl.json

{"struct":[{"entry_id":"7K00","title":"Structure of the Bacterial Ribosome at 2 Angstrom Resolution","pdbx_model_details":"?","pdbx_formula_weight":"?","pdbx_formula_weight_method":"?","pdbx_model_type_details":"?","pdbx_CASP_flag":"N"}],"exptl":[{"absorpt_coefficient_mu":"?","absorpt_correction_T_max":"?","absorpt_correction_T_min":"?","absorpt_correction_type":"?","absorpt_process_details":"?","entry_id":"7K00","crystals_number":"?","details":"?","method":"ELECTRON MICROSCOPY","method_details"

In the output we again look for `exptl -> method`:


"exptl": [
{
"absorpt_coefficient_mu": "?",
"absorpt_correction_T_max": "?",
"absorpt_correction_T_min": "?",
"absorpt_correction_type": "?",
"absorpt_process_details": "?",
"entry_id": "7K00",
"crystals_number": "?",
"details": "?",
"method": "ELECTRON MICROSCOPY"
}
]


This confirms that the structure 7K00 was solved by electron microscopy, that is, **Cryo-EM**.


### Checking the experimental resolution


In [53]:
!python ../src/rnapolis/metareader.py \
  ../data/pdb/7k00.cif \
  -c em_3d_reconstruction \
  > ../outputs/7k00_resolution.json

In [52]:
!head ../outputs/7k00_resolution.json

{"struct":[{"entry_id":"7K00","title":"Structure of the Bacterial Ribosome at 2 Angstrom Resolution","pdbx_model_details":"?","pdbx_formula_weight":"?","pdbx_formula_weight_method":"?","pdbx_model_type_details":"?","pdbx_CASP_flag":"N"}],"em_3d_reconstruction":[{"entry_id":"7K00","id":"1","algorithm":"?","details":"Ewald sphere corrected in RELION","refinement_type":"?","image_processing_id":"1","num_class_averages":"?","num_particles":"307495","resolution":"1.98","resolution_method":"FSC 0.143 CUT-OFF","symmetry_type":"POINT","method":"?","nominal_pixel_size":"?","actual_pixel_size":"?","magnification_calibration":"?"}]}


In Cryo-EM entries the experimental resolution is typically stored in the
`em_3d_reconstruction` category. We look for the `resolution` field:


"em_3d_reconstruction": [
{
"entry_id": "7K00",
"id": "1",
"algorithm": "?",
"details": "Ewald sphere corrected in RELION",
"refinement_type": "?",
"image_processing_id": "1",
"num_class_averages": "?",
"num_particles": "307495",
"resolution": "1.98"
}
]


In this case the experimental resolution of the structure equals **1.98 Å**.
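The full output also carries a `resolution_method` field, which records the criterion behind the reported number. A small sketch that reads both fields from a trimmed copy of the 7K00 output; the `summary` format is just an illustration:

```python
import json

# Trimmed copy of the 7K00 `-c em_3d_reconstruction` output shown above.
raw = (
    '{"em_3d_reconstruction":[{"entry_id":"7K00","id":"1",'
    '"num_particles":"307495","resolution":"1.98",'
    '"resolution_method":"FSC 0.143 CUT-OFF"}]}'
)

for row in json.loads(raw)["em_3d_reconstruction"]:
    resolution = float(row["resolution"])
    # Fall back to the mmCIF "unknown" marker if the criterion is missing.
    criterion = row.get("resolution_method", "?")
    summary = f'{row["entry_id"]}: {resolution} Å ({criterion})'
    print(summary)
```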


### For 7K00:

- method: read from `exptl -> method`,
- resolution: read from `em_3d_reconstruction -> resolution`.


## Summary

Using `metareader`, experimental information can be extracted directly from
mmCIF metadata:

For X-ray structures (e.g. 1UBQ):

- method: read from `exptl -> method`,
- resolution: read from `refine -> ls_d_res_high`.

For Cryo-EM structures (e.g. 7K00):

- method: read from `exptl -> method`,
- resolution: read from `em_3d_reconstruction -> resolution`.


This shows how experimental resolution can be verified for both X-ray and Cryo-EM
structures using the same tool, based only on the metadata stored in the mmCIF files.
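The two lookup rules can be combined into a single function. The sketch below assumes the `exptl` output and the resolution output for one entry have already been loaded and merged into one dictionary; `RESOLUTION_FIELDS` and `read_resolution` are hypothetical names, and taking the smallest value when a category holds several rows is a design choice rather than something `metareader` does.

```python
# Where each technique keeps its resolution, following the summary above.
RESOLUTION_FIELDS = {
    "X-RAY DIFFRACTION": ("refine", "ls_d_res_high"),
    "ELECTRON MICROSCOPY": ("em_3d_reconstruction", "resolution"),
}


def read_resolution(metadata):
    """Return (method, resolution in Å, or None if it cannot be determined)."""
    method = metadata["exptl"][0]["method"]
    if method not in RESOLUTION_FIELDS:
        return method, None
    category, field = RESOLUTION_FIELDS[method]
    values = [row.get(field) for row in metadata.get(category, [])]
    # Skip missing values and the mmCIF placeholders "?" and ".".
    numbers = [float(v) for v in values if v not in (None, "?", ".")]
    # Assumption: with several rows, report the best (smallest) value.
    return method, (min(numbers) if numbers else None)


xray = {
    "exptl": [{"entry_id": "1UBQ", "method": "X-RAY DIFFRACTION"}],
    "refine": [{"entry_id": "1UBQ", "ls_d_res_high": "1.8"}],
}
em = {
    "exptl": [{"entry_id": "7K00", "method": "ELECTRON MICROSCOPY"}],
    "em_3d_reconstruction": [{"entry_id": "7K00", "resolution": "1.98"}],
}

print(read_resolution(xray))
print(read_resolution(em))
```

Methods that are not in the table, such as NMR, yield `None` for the resolution instead of raising an error.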
