Accounting for directionality of dataset data-BioGrid-Yeast #8

Open
T-Wisse opened this issue Dec 16, 2020 · 3 comments

Comments

@T-Wisse
Owner

T-Wisse commented Dec 16, 2020

I have been experimenting with your scripts. For interaction data they use the Excel file data-BioGrid-Yeast, which holds interaction data for a large number of genes. The file contains a number of duplicates, but that is not a problem. While testing with BEM2, I noticed that the list of interactors with BEM2 as query is not identical to the list with BEM2 as target. This makes sense, as the dataset is large and you don't want duplicate entries. However, the code only searches for the gene of interest (BEM2) in the query column, so a number of interactions are missed. For now I took the easy way out and simply duplicated the dataset with the query and target columns switched, which results in quite a different figure (top new, bottom old). I am not sure if you were actually using this dataset, but it's good to know.
Figure 2020-12-16 093057
Figure 2020-12-15 083822
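
The workaround described above can be sketched roughly as follows. This is a minimal illustration, not the code from the scripts: the column names `query` and `target`, the helper names, and the placeholder partners `GENE_A` to `GENE_C` are all assumptions, and the rows are made-up toy data rather than entries from data-BioGrid-Yeast.

```python
# Toy stand-in for the interaction table; rows and column names are hypothetical.
rows = [
    {"query": "BEM2", "target": "GENE_A"},
    {"query": "BEM2", "target": "GENE_B"},
    {"query": "GENE_C", "target": "BEM2"},  # BEM2 appears only as target here
    {"query": "BEM2", "target": "GENE_A"},  # duplicate row, harmless
]


def symmetrize(rows):
    """Return the rows plus a copy of each row with query and target swapped."""
    swapped = [{**r, "query": r["target"], "target": r["query"]} for r in rows]
    return rows + swapped


def interactors(rows, gene):
    """Unique targets listed for `gene` in the query column."""
    return sorted({r["target"] for r in rows if r["query"] == gene})


one_way = interactors(rows, "BEM2")                # query column only
both_ways = interactors(symmetrize(rows), "BEM2")  # after duplicating with swapped columns
```

In this toy table, `one_way` misses `GENE_C` because it is only recorded with BEM2 in the target column, while `both_ways` picks it up. Whether treating the two columns as interchangeable is appropriate depends on how the query and target roles are defined in the dataset.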

@T-Wisse T-Wisse added the bug Something isn't working label Dec 16, 2020
@leilaicruz
Collaborator

Hi Thomas, the point here is to understand the concepts behind query and target, which come from the way SGA works. I would first reread that part of "Genetic Networks" by Costanzo. In general, I take the query column as the gene of interest and then look at the unique values in the target column as its interactors. The query is the initial mutation, and the targets are the double-deletion genotypes found after sporulation, which are the ones whose fitness could be measured. Another point is that genetic interactions are not bidirectional: if A is a positive interactor of B in a certain background, that does not mean that B is a positive interactor of A. Take bem1 and bem3, for example: bem3 is a positive interactor of bem1, but bem1 is not a positive interactor of bem3 under the same definition (dbem1dbem3 growth rate < dbem3 growth rate).
That is why I adopted the convention of always looking up the interactors in the target column that correspond to a given gene in the query column.
Does that make sense to you?
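
One way to see how much this convention matters for a given table is to list the pairs that are recorded in only one orientation. The sketch below is a hypothetical helper, not part of the scripts; the pairs are made-up toy data and `GENE_X` is a placeholder name.

```python
def one_direction_only(pairs):
    """Return the (query, target) pairs whose reverse (target, query) is absent."""
    seen = set(pairs)
    return sorted(p for p in seen if (p[1], p[0]) not in seen)


# Hypothetical (query, target) pairs, for illustration only.
pairs = [
    ("BEM1", "BEM3"),    # recorded in one orientation only
    ("BEM3", "GENE_X"),  # recorded in both orientations
    ("GENE_X", "BEM3"),
]

unpaired = one_direction_only(pairs)
```

Here `unpaired` contains only `("BEM1", "BEM3")`. A pair showing up in this list says nothing by itself about whether the reverse interaction is absent biologically or was simply not screened in that orientation.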

@T-Wisse
Owner Author

T-Wisse commented Dec 16, 2020

The part about SGA makes sense, yes. However, the way I understood it, the genetic interaction should be bidirectional. Observing that dbem1dbem3 growth rate < dbem3 growth rate does not mean that they do not interact positively; for that, it would have to be dbem1dbem3 growth rate <= expected dbem1dbem3 growth rate. In other words, introducing the bem1 deletion into the bem3 deletion does lower the fitness, but the interaction is positive because it lowers the fitness less than expected (based on its effect in wild type). If I made an error in my thinking there, you are of course right to do it that way.
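
The argument above can be illustrated with a small numeric sketch. It assumes the multiplicative model commonly used for SGA data, in which the expected double-mutant fitness is the product of the two single-mutant fitnesses, all relative to wild type. The function name and the fitness values are made up for illustration and are not measurements for bem1 or bem3.

```python
def interaction_score(f_a, f_b, f_ab):
    """Observed double-mutant fitness minus the multiplicative expectation f_a * f_b."""
    return f_ab - f_a * f_b


# Hypothetical fitness values relative to wild type (= 1.0); not real data.
f_a = 0.5   # single mutant a
f_b = 0.9   # single mutant b
f_ab = 0.6  # double mutant

expected = f_a * f_b                         # multiplicative expectation
score_ab = interaction_score(f_a, f_b, f_ab)
score_ba = interaction_score(f_b, f_a, f_ab)  # same pair, roles swapped
```

With these numbers the double mutant is less fit than single mutant b (0.6 < 0.9), yet the score is positive because 0.6 exceeds the expected value of about 0.45. Swapping the roles of a and b gives the same score, which is the sense in which the interaction is symmetric under this definition.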

@leilaicruz
Collaborator

leilaicruz commented Dec 16, 2020 via email

@T-Wisse T-Wisse removed the bug Something isn't working label Dec 16, 2020