[DEVELOP] Adding TAD class #34

gwaybio · 2019-10-04T11:29:26Z

Adding a method to facilitate a TAD_Pathways analysis. Given a file storing SNP info, the method calls each script in the tad_pathways pipeline in order.

This is in preparation for Diana's Longitudinal BMD paper. I'm not really sure where this should all live...

The TAD_Pathways portion of the longitudinal BMD analysis consists of three steps:

1. Running TAD_Pathways on 41 different SNP lists
2. Simulating gene lists using randomly selected TADs matched by gene content similarity
3. Comparing implicated genes in real TAD_Pathways to permuted gene lists (and Diana digs deeper into these specific genes in downstream analyses)

At this point, nearly all of the functionality lives on the develop branch in this repo. All of it except this PR has already been reviewed (see #21, #22, #25, #26, #30, #31, #33). My original goal was to make this repo a Python package so that Diana's analysis could live in an independent repo, but that goal is likely to require several more hours of work.

An alternative is to add the longitudinal BMD analysis here in a separate branch (can branch off of develop) and generate a release on this specific branch. I think this will take the least amount of time.

There is also the element of code review. I certainly don't want to take up any time of anyone in the lab...

Thoughts, @cgreene?

cgreene · 2019-10-10T08:09:59Z

For all of the reasons that code review is important, including the opportunity to catch certain bugs, identify new solutions, and make sure code is reusable across the lab (and hopefully also the field), it seems worth going through review. This isn't an inordinately large amount of code. I bet that if you asked @dongbohu he would be happy to take a look.

I'm fine with the develop branch of this repo. You might also be able to get help from @dongbohu to make this repo into a python package. I'm not sure if he has the capacity but it seems like it'd be worth asking.

dongbohu

@gwaygenomics: Most of the code looks good. There are a few minor issues for you to address or clarify.

dongbohu · 2019-10-10T15:07:51Z

tad_pathways/config.py

+import yaml
+
+
+def config_yaml(config_file):


So I suppose this module will be used in the future by users to read in a yaml config file?

dongbohu · 2019-10-10T15:10:39Z

tad_pathways/config.py

+    # Load configuration
+    with open(config_file, "r") as stream:
+        config = yaml.load(stream)
+


Do you have a template yaml file? I guess some error handling logic will be needed in the future to validate the input yaml file.

gwaybio · 2019-11-22

This isn't currently used, but it will be in the future. I will add error handling in a future PR (see #35).
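
One possible shape for the validation discussed here, as a minimal sketch rather than the implementation planned for #35. The required key names (`snp_file`, `tad_file`, `output_dir`) are hypothetical placeholders, since the config schema is not shown in this PR. The function works on an already-parsed mapping, such as the value returned by `yaml.safe_load(stream)`; `safe_load` is generally preferable to a bare `yaml.load(stream)`, which PyYAML 6 and later reject without an explicit `Loader`.

```python
# Hypothetical required keys; the real config schema is not shown in this PR.
REQUIRED_KEYS = ("snp_file", "tad_file", "output_dir")


def validate_config(config, required_keys=REQUIRED_KEYS):
    """Return config unchanged if it is a mapping with all required keys."""
    if not isinstance(config, dict):
        raise ValueError("config must be a mapping of option names to values")
    # Collect every missing key so the user can fix them all in one pass.
    missing = [key for key in required_keys if key not in config]
    if missing:
        raise ValueError("missing config keys: {}".format(", ".join(missing)))
    return config
```

Reporting all missing keys at once, rather than failing on the first, keeps the edit-and-rerun loop short for anyone starting from a template file.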

dongbohu · 2019-10-10T15:11:07Z

tad_pathways/config.py

@@ -0,0 +1,31 @@
+"""
+Gregory Way 2018


dongbohu · 2019-10-10T15:11:20Z

tad_pathways/genes/genelist.py

@@ -0,0 +1,167 @@
+"""
+Gregory Way 2018


dongbohu · 2019-10-10T15:13:52Z

tad_pathways/genes/genelist.py

+"""
+Gregory Way 2018
+TAD Pathways
+scripts/genes/genelist.py


This path doesn't match the actual location.

dongbohu · 2019-10-10T16:00:01Z

tad_pathways/tad.py

+
+    def build_custom_tad_genelist(self):
+        command_list = [
+            "python",


Is scripts/build_custom_tad_genelist.py compatible with Python 2 or 3 (or both)? By default, the python command calls Python 2 on many (if not most) operating systems.

gwaybio · 2019-11-22

Python 3. Should I change the call to "python3"?

dongbohu · 2019-11-22

If this package is for Python 3 only, then it is a good idea to change it to python3.

gwaybio · 2019-11-28

Done in fbe303d.
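
An alternative to hard-coding `python3`, sketched here as an assumption rather than a description of what fbe303d does: `sys.executable` holds the path of the interpreter running the current process, so child scripts run under the same Python version and environment. The helper name `run_script` is hypothetical.

```python
import subprocess
import sys


def run_script(script_path, *args):
    """Run a script with the same interpreter as the calling process."""
    command_list = [sys.executable, script_path] + list(args)
    # check=True raises CalledProcessError if the script exits non-zero.
    return subprocess.run(command_list, check=True)
```

This also behaves sensibly inside a virtual environment or conda environment, where `python3` on the PATH may not be the interpreter that has the pipeline's dependencies installed.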

dongbohu · 2019-10-10T16:02:28Z

tad_pathways/tad.py

+        output_pval_file = os.path.join(
+            self.base_dir, "{}_pvals.tsv".format(self.snp_list_name)
+        )
+        if os.path.exists(output_pval_file):


If you want, lines 136-139 can be simplified as:

return os.path.exists(output_pval_file)
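
The suggestion can be sketched as a standalone function. The diff shows only the first line of the `if` block, so the "before" form described in the comment is an assumed reconstruction, and the signature is hypothetical: the PR's version is a method that reads `self.base_dir` and `self.snp_list_name`.

```python
import os


def check_webgestalt(base_dir, snp_list_name):
    """Report whether the p-value output file already exists."""
    output_pval_file = os.path.join(
        base_dir, "{}_pvals.tsv".format(snp_list_name)
    )
    # Assumed original form: an if/else returning True or False.
    # os.path.exists() already returns a bool, so return it directly.
    return os.path.exists(output_pval_file)
```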

dongbohu · 2019-10-10T16:02:58Z

tad_pathways/tad.py

+    def get_evidence(self):
+        # The first command builds the evidence
+        command_list = [
+            "python",


Same question on Python 2/3 compatibility.

dongbohu · 2019-10-10T16:04:26Z

tad_pathways/tad.py

+
+        # The second command summarizes this evidence
+        command_list = [
+            "python",


Same question on python 2/3 compatibility.

dongbohu · 2019-10-10T16:06:35Z

tad_pathways/tad.py

+        webgestalt_exists = self.check_webgestalt()
+        if webgestalt_exists:
+            self.get_evidence()
+            return 1


Just curious: will the return value of this method be used later?

gwaybio · 2019-11-22

Hmm, it doesn't look like I use this. I'm not sure why it's here; maybe it's left over from some functionality that I decided not to use. I will remove it for now.

DCousminer · 2019-11-01T16:59:44Z

Proposed next steps to complete the enrichment analysis:

1. Divvy the output from step 2 into 100 sets of 41 matched TADs
2. Filter by protein-coding genes
3. Submit to DAVID for functional annotation
4. Count the number of times genes appear with specific key words of interest (e.g. "osteoblast"), being careful to count each gene only once
5. Create a null distribution based on the 100 iterations
6. Compare the actual count against the null distribution and calculate a p-value for enrichment of the "real" list vs. the simulated lists

We are working on this; any thoughts before we proceed are welcome.
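
The counting and p-value steps in this proposal could be sketched as follows. This is an illustration under stated assumptions, not the analysis code: the function names, the `annotations` mapping (gene symbol to annotation text, for example parsed from DAVID output), and the add-one empirical p-value estimator are all hypothetical choices.

```python
def count_keyword_genes(annotations, keyword):
    """Count genes whose annotation text mentions keyword (each gene once)."""
    keyword = keyword.lower()
    # A dict holds one entry per gene, so each gene is counted at most once.
    return sum(1 for text in annotations.values() if keyword in text.lower())


def empirical_p_value(observed, null_counts):
    """One-sided p-value: share of null counts at least as large as observed.

    Adding one to the numerator and denominator avoids reporting p = 0
    from a finite number of simulations.
    """
    n_extreme = sum(1 for count in null_counts if count >= observed)
    return (n_extreme + 1) / (len(null_counts) + 1)
```

With 100 simulated sets, the smallest p-value this estimator can return is 1/101, so the number of iterations bounds how strong a result the comparison can report.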

gwaybio · 2019-12-04T12:47:11Z

@dongbohu - ready again for re-review!

dongbohu

Looks good!

gwaybio added 4 commits October 4, 2019 07:06

add tad.py

c845b20

add config python method

c387d51

add genelist method

8481889

run black on added files

f2b9815

dongbohu reviewed Oct 10, 2019

gwaybio added 2 commits November 22, 2019 08:22

respond to PR comments

aa27d8b

update documentation, update all_pathways argument, streamline return statements

respond to pr comments

fdae079

update documentation, remove remove_hla_tad argument from compile()

gwaybio mentioned this pull request Nov 22, 2019

Validate tad_pathways config file #35

Open

gwaybio added 4 commits November 22, 2019 08:27

fix year

a3c4877

add keyerror handling

2f33c4c

fix documentation in construct_evidence

3e6142b

change to python3 call

fbe303d

gwaybio added 3 commits December 4, 2019 08:21

minor updates to flags

ca410cc

use readr::write_tsv() and fix webgestalt updates

c414ca8

fix for webgestalt updates

8c123aa

dongbohu approved these changes Dec 4, 2019

gwaybio merged commit 2de630d into greenelab:develop Dec 19, 2019

gwaybio deleted the add-tad-class branch December 19, 2019 11:31
