# GSD Highlight changes in the protein-protein interactions of ys RNase MRP vs. RNase P via PDBsum data

This notebook adapts the generic notebook I made for comparing protein-protein interactions in pairs of related structures so that it covers the combinations of protein-protein interactions in yeast RNase MRP vs. RNase P.

----


**Step #1:** Make a table with a matrix of the protein-protein combinations in the pairs of cryo-EM structures of RNase MRP and RNase P.

I used the PDBsum pages for protein-protein interactions to make this table. I could have generated the combinations computationally; however, several of those would not actually be relevant, and I would then have had to sort out which ones return an 'empty' list of interactions from PDBsum for both structures. That shouldn't be too hard, and it could be added down the road so that one could just supply two PDB id codes and let all the reports get generated. Note that a combination has to be empty for both structures before it can be dropped: if one structure has no interactions and the other has some, that is definitely an important difference to catch. For example, Pop1p [chain B] and Pop5p [chain E] interact in RNase P but not in RNase MRP. The flip side is that in RNase MRP, Pop1p gains interactions with Rmp1 [chain L] and Pop4p [chain D] that are not seen in RNase P.

For now, I decided to construct the table myself so that nothing is missed due to an error in the steps for generating it. Also, certain combinations were the original impetus for this effort, so those were added first, and then I expanded out to check the other interactions.
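The computational route described above could look something like the following. This is only a sketch under stated assumptions: `candidate_rows` and `drop_rows_empty_in_both` are hypothetical helper names, and the contact sets are made-up placeholders standing in for whatever would be parsed from the PDBsum interaction pages (fetching and parsing PDBsum is not shown). The point it illustrates is the rule that a pair is only dropped when it is empty in both structures.

```python
from itertools import combinations


def candidate_rows(pdb_a, pdb_b, chains):
    """Return one table row per unordered chain pair, using the same chains in both structures."""
    return [
        f"{pdb_a} {c1} {c2} {pdb_b} {c1} {c2}"
        for c1, c2 in combinations(chains, 2)
    ]


def drop_rows_empty_in_both(rows, contacts_a, contacts_b):
    """Keep a row unless its chain pair is absent from BOTH contact sets."""
    kept = []
    for row in rows:
        _, c1, c2, _, d1, d2 = row.split()
        in_a = frozenset((c1, c2)) in contacts_a
        in_b = frozenset((d1, d2)) in contacts_b
        if in_a or in_b:  # an interaction in only one structure is still a difference worth reporting
            kept.append(row)
    return kept


# Illustrative placeholder data only; NOT taken from PDBsum.
chains = ["B", "E", "F"]
rows = candidate_rows("7c7a", "6ah3", chains)
contacts_7c7a = {frozenset("BF")}
contacts_6ah3 = {frozenset("BF"), frozenset("BE")}

kept = drop_rows_empty_in_both(rows, contacts_7c7a, contacts_6ah3)
print("\n".join(kept))
```

With these placeholder sets, the `E F` row is dropped because it is empty in both structures, while `B E` is kept even though it appears in only one of them.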

In [1]:
s='''7c7a F G 6ah3 F G
7c7a F B 6ah3 F B
7c7a G B 6ah3 G B
7c7a E B 6ah3 E B
7c7a I B 6ah3 I B
7c7a F I 6ah3 F I
7c7a G E 6ah3 G E
7c7a I E 6ah3 I E
'''
%store s >int_matrix.txt

Writing 's' (str) to file 'int_matrix.txt'.
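Before running the workflow, it can help to sanity-check the table. The sketch below assumes each row has exactly six whitespace-separated fields (PDB id, two chains, PDB id, two chains), as in the table above, and that report names follow the `interactions_report_for_..._....py` pattern visible in the snakemake output in Step #3. How the Snakefile itself parses the table is not shown here, so `parse_matrix` and `report_stem` are hypothetical helpers, not the workflow's own code.

```python
def parse_matrix(text):
    """Split the table text into 6-field rows, raising on malformed lines."""
    rows = []
    for n, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines, such as a trailing one
        fields = line.split()
        if len(fields) != 6:
            raise ValueError(f"line {n}: expected 6 fields, got {len(fields)}")
        rows.append(tuple(fields))
    return rows


def report_stem(row):
    """Build the report name (without extension) assumed for one table row."""
    return "interactions_report_for_" + "_".join(row)


example = "7c7a F G 6ah3 F G\n7c7a F B 6ah3 F B\n"
for row in parse_matrix(example):
    print(report_stem(row) + ".py")
```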


**Step #2:** Copy the Snakefile that processes the table of interactions into this directory.

In [2]:
!cp ../Snakefile .

**Step #3:** Run snakemake and it will process the `int_matrix.txt` file to extract the information and make individual notebooks corresponding to analysis of the interactions for each line.  

In [3]:
!snakemake --cores 1

Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job counts:
	count	jobs
	1	all
	8	convert_scripts_to_nb_and_run_using_jupytext
	1	make_archive
	1	read_table_and_create_py
	11

[Fri Jan 22 20:27:08 2021]
rule read_table_and_create_py:
    input: int_matrix.txt
    output: interactions_report_for_7c7a_F_G_6ah3_F_G.py, interactions_report_for_7c7a_F_B_6ah3_F_B.py, interactions_report_for_7c7a_G_B_6ah3_G_B.py, interactions_report_for_7c7a_E_B_6ah3_E_B.py, interactions_report_for_7c7a_I_B_6ah3_I_B.py, interactions_report_for_7c7a_F_I_6ah3_F_I.py, interactions_report_for_7c7a_G_E_6ah3_G_E.py, interactions_report_for_7c7a_I_E_6ah3_I_E.py
    jobid: 3

Job counts:
	count	jobs
	1	read_table_and_create_py
	1
[Fri Jan 22 20:27:09 2021]
Finished job 3.
1 of 11 steps (9%) done

For those knowledgeable about snakemake: I set the number of cores to one because with eight I occasionally hit a race condition in which some of the auxiliary scripts used by the notebooks would overwrite each other while another notebook was accessing them, causing failures. Using one core avoids that hazard. I will add, though, that in most cases if you use multiple cores, you can easily get the additional files and a new archive made by running snakemake again with your chosen number of cores.

I never saw a race hazard with my clean rule, and so if you want to quickly start over you can run `!snakemake --cores 8 clean`.

**Step #4:** Verify the Jupyter notebooks with the reports were generated.  
You can go to the dashboard and see the output of running snakemake. To do that, click on the Jupyter logo in the upper left of this notebook, and on that page look in the notebooks directory. You should see files that begin with `interactions_report_` and end with `.ipynb`. You can examine some of them to ensure all is as expected.

If things seem to be working and you haven't run your data yet, run `!snakemake --cores 8 clean` in a cell to reset things, then edit and save `int_matrix.txt` so it holds your information, and then run the `!snakemake --cores 1` step above again.

**Step #5:** If you don't want to fix the reports by adding the protein names (see below), download the archive.
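The verification step above can also be done from a code cell instead of the dashboard. The following is a minimal sketch, assuming each row of `int_matrix.txt` yields a notebook named `interactions_report_for_<fields joined by underscores>.ipynb`. That exact stem is an assumption based on the `.py` names in the snakemake output; `missing_reports` is a hypothetical helper and not part of the workflow.

```python
from pathlib import Path


def missing_reports(matrix_path="int_matrix.txt", directory="."):
    """List the expected report notebooks that are not present in `directory`."""
    table = Path(matrix_path).read_text()
    expected = [
        "interactions_report_for_" + "_".join(line.split()) + ".ipynb"
        for line in table.splitlines()
        if line.strip()  # skip blank lines
    ]
    return [name for name in expected if not (Path(directory) / name).exists()]


# Only run the check when the table is actually present in the working directory.
if Path("int_matrix.txt").exists():
    print(missing_reports() or "all expected notebooks found")
```

An empty list means every row in the table has a matching notebook; any names returned are candidates for running snakemake again.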


-----

Please continue on with the notebook `GSD Adding protein names to protein-protein interactions reports for ys RNase MRP v RNase P.ipynb` to swap the protein names into the reports for easier reading.

-----


Enjoy.