Software used for processing, analysis, and plotting of data from:

Systematic analysis of low-affinity transcription factor binding site clusters in vitro and in vivo establishes their functional relevance

https://www.biorxiv.org/content/10.1101/2021.12.17.473130v1

Required dependencies can be found in the Requirements.txt file.

There are four Jupyter notebooks:

in-vivo-analysis_Zif268-Pho4.ipynb

Contains scripts used for the in vivo analysis. It can be run independently of the other three notebooks.

in-vitro-analysis_Zif268.ipynb

Contains scripts used for the in vitro analysis of Zif268 data. For the section that relates gene expression to mean occupancy, it requires data generated from the in-vivo-analysis_Zif268-Pho4.ipynb notebook.

in-vitro-analysis_Pho4.ipynb

Contains scripts used for the in vitro analysis of Pho4 data. It can be run independently of the other three notebooks.

in-vitro-summary-plotting_Zif268-Pho4.ipynb

Contains scripts and plotting functionality to generate summary plots used in the manuscript, for both Zif268 and Pho4 data. Accordingly, the Zif268 and Pho4 in-vitro-analysis notebooks should be run first.

The cells of each notebook can be run from start to finish. Jupyter Lab is recommended, and the notebooks can be run in the order in which they are listed above.
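The dependencies between the notebooks described above can be written down and checked in a few lines. The following is a minimal sketch, not part of the repository: the names `DEPENDS_ON`, `LISTED_ORDER`, `is_valid_order`, and `execution_commands` are hypothetical, and it assumes the notebooks live in the `jupyter_notebooks` folder. The commands it builds use the standard `jupyter nbconvert --execute` interface for running a notebook without opening it.

```python
# Notebook -> notebooks whose output it needs, as stated in the descriptions above.
# (The Zif268 notebook only needs the in vivo output for its gene-expression section.)
DEPENDS_ON = {
    "in-vivo-analysis_Zif268-Pho4.ipynb": set(),
    "in-vitro-analysis_Zif268.ipynb": {"in-vivo-analysis_Zif268-Pho4.ipynb"},
    "in-vitro-analysis_Pho4.ipynb": set(),
    "in-vitro-summary-plotting_Zif268-Pho4.ipynb": {
        "in-vitro-analysis_Zif268.ipynb",
        "in-vitro-analysis_Pho4.ipynb",
    },
}

# The order in which the notebooks are listed in this README.
LISTED_ORDER = [
    "in-vivo-analysis_Zif268-Pho4.ipynb",
    "in-vitro-analysis_Zif268.ipynb",
    "in-vitro-analysis_Pho4.ipynb",
    "in-vitro-summary-plotting_Zif268-Pho4.ipynb",
]


def is_valid_order(order, depends_on):
    """Return True if every notebook comes after all notebooks it depends on."""
    seen = set()
    for notebook in order:
        if not depends_on[notebook] <= seen:
            return False
        seen.add(notebook)
    return True


def execution_commands(order, notebook_dir="jupyter_notebooks"):
    """Build one `jupyter nbconvert` command per notebook (assumed folder name)."""
    return [
        ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace",
         f"{notebook_dir}/{notebook}"]
        for notebook in order
    ]
```

Each command list can be passed to `subprocess.run(cmd, check=True)` to execute the notebooks one after another. Note that this would re-run the MCMC sections as well.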

There is virtually no install time required.

Raw data are included in this project, along with some intermediate data, so that most plotting scripts can be run and their plots viewed without re-running Markov Chain Monte Carlo (MCMC). MCMC is the most time-consuming and computationally intensive part of the code: a run with 10,000 steps generally takes a few hours. As a result, running the full code with all of the different models takes roughly a full day.
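Reusing stored intermediate results instead of repeating a long MCMC run is a load-or-compute pattern. The sketch below illustrates the idea only; it is not the code used in the notebooks. The helper `load_or_compute`, its parameters, and the use of `pickle` as the storage format are assumptions made for illustration.

```python
import pickle
from pathlib import Path


def load_or_compute(cache_path, compute, force=False):
    """Return the cached result at cache_path, or run compute() and cache it.

    compute is any zero-argument callable (for example, a function that
    runs an MCMC fit). Set force=True to ignore an existing cache file.
    """
    cache_path = Path(cache_path)
    if cache_path.exists() and not force:
        with cache_path.open("rb") as handle:
            return pickle.load(handle)

    result = compute()  # the expensive step, e.g. hours of sampling
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with cache_path.open("wb") as handle:
        pickle.dump(result, handle)
    return result
```

With this pattern a plotting cell only triggers the expensive computation when no stored result exists, which is the behaviour the provided intermediate data is meant to enable.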

The expected output from the code is generally explained with the detailed headers (markdown cells) found throughout the code.

Binding site information:

All in vitro DNA targets are 90 bp in length.

Pho4

"X" stands for non-specific DNA designed not to bind the transcription factor (TF).
"S" represents a strong binding site.
"M" represents a weak binding site.
"W" represents a very weak binding site.
Ex: M1, M2, and M3 are different members of the weak class of binding sites.
Two different notations are used. In the first, all non-primer regions are specified:
Ex: S1XXXX represents the DNA target with a single consensus binding
site in the position furthest from the chip's surface; the remaining DNA is designed not to bind the TF.
In the second (bracket notation), a gap distance is specified in parentheses, corresponding to that number of non-specific
basepairs.

Zif268

"A" represents a binding site, without referring to its affinity class.
A11 represents the consensus, strong binding site.
(The S naming is not used for Zif268)
Everything else follows the Pho4 conventions, with the additional convention
that negative gap distances can be specified for binding sites that share
common basepairs (similar to $\Delta$ in the manuscript).
Ex: A11(-3)A11 refers to two neighboring consensus binding sites that share
three basepairs (overlapping).
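The naming conventions above can be split into their parts programmatically. The following is a small sketch under stated assumptions, not code from the notebooks: the function `parse_target`, the regular expression, and the tags `"site"`, `"nonspecific"`, and `"gap"` are all hypothetical. It treats an uppercase letter followed by optional digits as a binding-site label, a bare `X` as a non-specific region, and a parenthesised integer (possibly negative) as a gap distance.

```python
import re

# Either a gap such as "(5)" or "(-3)", or a label such as "S1", "A11" or "X".
TOKEN = re.compile(r"\((-?\d+)\)|([A-Z]\d*)")


def parse_target(name):
    """Split a DNA target name into (tag, value) tuples.

    Illustrative only; the tags are assumptions, not the authors' data model.
    """
    tokens = []
    pos = 0
    while pos < len(name):
        match = TOKEN.match(name, pos)
        if match is None:
            raise ValueError(f"Unrecognised text at position {pos} in {name!r}")
        if match.group(1) is not None:
            # Gap distance in basepairs; negative values mean overlapping sites.
            tokens.append(("gap", int(match.group(1))))
        elif match.group(2) == "X":
            tokens.append(("nonspecific", "X"))
        else:
            tokens.append(("site", match.group(2)))
        pos = match.end()
    return tokens
```

For example, `parse_target("A11(-3)A11")` yields a site, a gap of -3, and a second site, while `parse_target("S1XXXX")` yields one site followed by four non-specific regions.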

Repository contents:

in-vitro_fits_plots
in-vivo_fits_plots
jupyter_notebooks
obj
processed_datasets
source_data
LICENSE.md
README.md
Requirements.txt

Repository: eukaryoting/systematic_analysis_of_low-affinity_clusters

Folders and files

Latest commit

History

Repository files navigation

Software used for processing, analysis, and plotting of data from:

Systematic analysis of low-affinity transcription factor binding site clusters in vitro and in vivo establishes their functional relevance

https://www.biorxiv.org/content/10.1101/2021.12.17.473130v1

There are four jupyter notebooks:

in-vivo-analysis_Zif268-Pho4.ipynb

in-vitro-analysis_Zif268.ipynb

in-vitro-analysis_Pho4.ipynb

in-vitro-summary-plotting_Zif268-Pho4.ipynb

Binding site information:

Pho4

Zif268

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages