# Making summary report notebooks & associated files for the hu.MAP3 complexes for many proteins

Work through the steps below to run a Snakemake workflow that processes the identifiers for many proteins/genes and generates, for each one, a summary report of the human complexes in which it is represented.

------------

## Step #1: Preparation

Make a list of the identifiers for the proteins of interest. You can use either the official gene name or the UniProt identifier for each protein; both work. Put them below the line `%%writefile Make_me_humap3_complex_reports_for_these.txt`, one on each line. A demonstration list is below, and you may just want to run things with that first.


In [None]:
%%writefile Make_me_humap3_complex_reports_for_these.txt
FBL
Q9NX24
XRN1
FakeName_for_testing

(Edit the list above, below the line `%%writefile Make_me_humap3_complex_reports_for_these.txt`, to have the UniProt identifiers or the gene names for the corresponding proteins of interest, one on each line. If you are running this for the first time, I suggest using the demonstration identifiers to see if it all works, and then editing and re-running according to the suggestions below. Or just start a new temporary session to get back to square one. If you have run this before, though, edit the list above at this point so the reports are made with what you are interested in.)
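
If your identifiers already sit in a Python list, the same file can be written from a regular code cell instead of with `%%writefile`. This is only a minimal sketch and not part of the workflow: the helper name `write_id_list` and its cleanup rules (trimming whitespace, dropping blank entries and duplicates) are assumptions made for convenience. Only the file name comes from the cell above.

```python
from pathlib import Path


def write_id_list(ids, path="Make_me_humap3_complex_reports_for_these.txt"):
    """Write identifiers one per line (hypothetical helper, not part of the workflow)."""
    cleaned = []
    for raw in ids:
        ident = raw.strip()
        # Skip blank entries and keep only the first copy of a repeated identifier.
        if ident and ident not in cleaned:
            cleaned.append(ident)
    Path(path).write_text("\n".join(cleaned) + "\n")
    return cleaned


# Demonstration identifiers from the cell above, plus some deliberate mess.
ids = write_id_list(["FBL", "Q9NX24", " XRN1 ", "", "FBL"])
print(ids)
```

Note that running it overwrites the list file, just as re-running the `%%writefile` cell would.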

------------

## Step #2: Running Snakemake to make the many summary reports and associated files

Run snakemake and it will process the list of identifiers of interest to extract the information and make individual notebooks corresponding to the hu.MAP 3.0 data for each protein. This will be very similar to running the basic notebooks in this series, but it will do it for all of the identifiers programmatically.  
The workflow file that the next command points snakemake to with the `-s` option, `id_2_humap3_complexes_snakefile`, is already here, and that is what will run when the command is executed.
It will take a few minutes to complete if you are running the demonstration. If you edited the list, it will take longer.

In [None]:
!snakemake -s id_2_humap3_complexes_snakefile --cores 1
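
Once that run finishes, a quick programmatic check of which listed identifiers have a report notebook can complement looking through the file browser. The sketch below is hedged: `report_status` is a hypothetical helper, and it assumes every report is named `Summary_report_humap3_data_for_<identifier>.ipynb` using the identifier exactly as typed in the list. That pattern is only inferred from the FBL demonstration report, so a 'no report found' result for a UniProt identifier may just mean the workflow named that file differently.

```python
from pathlib import Path


def report_status(id_file="Make_me_humap3_complex_reports_for_these.txt",
                  folder=".",
                  pattern="Summary_report_humap3_data_for_{}.ipynb"):
    """Map each listed identifier to whether a matching report notebook exists.

    The file-name pattern is an assumption based on the FBL demonstration report.
    """
    lines = Path(id_file).read_text().splitlines()
    ids = [line.strip() for line in lines if line.strip()]
    return {i: (Path(folder) / pattern.format(i)).exists() for i in ids}


# Only report when the list file from Step #1 is present in this directory.
if Path("Make_me_humap3_complex_reports_for_these.txt").exists():
    for ident, found in report_status().items():
        print(f"{ident}: {'report found' if found else 'no report found'}")
```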

**Step #3:** Verify the Jupyter notebooks with the reports were generated.  
If you ran the demonstration identifiers, you can click [here](Summary_report_humap3_data_for_FBL.ipynb) to open one of the reports. For the others, you should see them listed in the file browser pane to the left.

If things seem to be working and you haven't run your data yet, run `!snakemake -s id_2_humap3_complexes_snakefile --cores 1 clean` in a cell to reset things. Then edit the list above, below `%%writefile Make_me_humap3_complex_reports_for_these.txt`, to have your identifiers of interest, one on each line; run that edited cell; and then run the `!snakemake -s id_2_humap3_complexes_snakefile --cores 1` step above again.

Download anything useful you make. See the next step, because there is an easy zip file you can grab to get everything related.

**Step #4:** If this was anything other than the demonstration run, download the archive containing all the Jupyter notebooks bundled together.  
For ease in downloading, all the created notebooks and associated files have been saved as a zip archive so that you only need to retrieve and keep track of one file. The name of the file you are looking for begins with `complexes_report_nbs_and_files_`, followed by a date/time stamp, and ends with `.zip`. The snakemake run will actually highlight this archive towards the very bottom of its output, following the words 'Be sure to download'.  
**Download that file from this remote, temporary session to your local computer.** You should see this archive file ending in `.zip` in the file browser pane to the left. Toggle the checkbox next to it to select it and then select `Download` to bring it from the remote JupyterHub session to your computer. If you don't retrieve that file and the session ends, you'll need to re-run the workflow to get the results again.
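
If the file browser is crowded, the archive can also be located from a code cell. The following is a rough sketch under stated assumptions: `newest_archive` is a hypothetical helper, and it assumes the archive sits in the current working directory and that the most recently modified match is the one you want. The `complexes_report_nbs_and_files_` prefix and the `.zip` extension come from the description above.

```python
import zipfile
from pathlib import Path


def newest_archive(folder=".", prefix="complexes_report_nbs_and_files_"):
    """Return the most recently modified matching .zip, or None if there is none."""
    candidates = sorted(Path(folder).glob(f"{prefix}*.zip"),
                        key=lambda p: p.stat().st_mtime)
    return candidates[-1] if candidates else None


archive = newest_archive()
if archive is None:
    print("No archive found yet; has the snakemake run finished?")
else:
    print(f"Archive to download: {archive.name}")
    # List the bundled files so you can confirm the reports are inside.
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            print("  ", name)
```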

You should be able to unpack that archive using your favorite software for extracting zip files. If that is proving difficult, you can always open a new session like the one you used to run this series of notebooks, upload the archive, and then run the following command in a Jupyter notebook cell to unpack it:

```bash
!unzip complexes_report_nbs_and_files_*.zip
```

(If you are running that command on the command line, leave off the exclamation mark.)
You can then examine the files in the session or download the individual Jupyter notebooks or associated `.tsv` files, following the advice given above on how to download the archive.
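
Where the `unzip` program is not available, Python's standard-library `zipfile` module can do the same job from a notebook cell. This is a minimal sketch rather than anything the workflow provides: `unpack_reports` is a hypothetical helper, and it assumes the uploaded archive is in the current working directory and keeps the `complexes_report_nbs_and_files_` prefix.

```python
import zipfile
from pathlib import Path


def unpack_reports(folder=".", prefix="complexes_report_nbs_and_files_", dest="."):
    """Extract every matching archive into dest and return the extracted names."""
    extracted = []
    for archive in sorted(Path(folder).glob(f"{prefix}*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)
            extracted.extend(zf.namelist())
    return extracted


# Prints an empty list if no matching archive is present.
print(unpack_reports())
```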


If this notebook has you interested in learning more about Snakemake as workflow management software, I used a related, yet distinct and simpler, workflow to provide more background on using Snakemake in the notebook 'Making multiple interface-reporting dataframes for several structures using snakemake', available when you go [here](https://github.com/fomightez/pdbepisa-binder) and click `launch binder`. That Jupyter notebook also suggests further resources for learning to write Snakemake workflows.

-----

Enjoy!

See my [humap3-binder repo](https://github.com/fomightez/humap3-binder) and [humap3-utilities](https://github.com/fomightez/structurework/humap3-utilities) for related information & resources for this notebook.



-----
