Project files for the family pedigree networkx file enumeration.

J-Pesos/GSEnumeration

### Enumeration of family relationships in a relationships file.

A Python script that enumerates the different family relationships in a
family relations .txt or .nx file.

### Why should I use this project?

This script allows you to quickly enumerate the number and types of relationships
found within any family relations .txt file.

### Setup

You need `python>3.8` to run this script.

The project depends on the `pandas`, `networkx`, and `numpy` packages. Install them with pip:
`pip install pandas networkx numpy`

The `itertools` and `argparse` modules it also uses are part of the Python standard library and need no installation.
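
As a quick check before running the scripts, the snippet below reports which of the modules listed above cannot be imported in the current environment. It is a minimal sketch and not part of the project; the helper name `missing_modules` is hypothetical.

```python
import importlib.util

# Modules named in the Setup section. itertools and argparse ship with
# Python itself; the other three are third-party packages.
REQUIRED = ["pandas", "networkx", "numpy", "itertools", "argparse"]


def missing_modules(names):
    """Return the module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```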

### How to run?

You can run the Enumeration script from the command line using:
```
python Enumeration_Final.py -n <relations_file> -gd <generation_depth> -me <meioses> -t <type> -o <output_name>
```
Where:

`-n`, `--networkx` is the full name of the family relations .txt file.

`-gd`, `--generation` is the generation depth to search for, given as an integer.

`-me`, `--meioses` is the number of meiosis events to search for, given as an integer.

`-t`, `--type` is the type of relationship to search for: one of `half`, `full`, `direct`, or `NA`. A direct relationship is a direct descendant/ancestor, and `NA` applies to exclusions such as comparing an individual against themselves.

`-o`, `--output` is a string used as the file name for the results output by the Enumeration script.
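
The options described above map naturally onto `argparse`, which the project lists as a dependency. The following is a sketch of how such a parser could be declared, not the script's actual code: the function name `build_parser`, the `required=True` on `-n`, the `int` conversions, and the `choices` restriction are all assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser for the five documented options (illustrative sketch)."""
    parser = argparse.ArgumentParser(
        description="Enumerate family relationships in a relations file."
    )
    # Assumption: the input file is mandatory, the search options are not.
    parser.add_argument("-n", "--networkx", required=True,
                        help="name of the family relations file")
    parser.add_argument("-gd", "--generation", type=int,
                        help="generation depth to search for")
    parser.add_argument("-me", "--meioses", type=int,
                        help="number of meiosis events to search for")
    parser.add_argument("-t", "--type",
                        choices=["half", "full", "direct", "NA"],
                        help="relationship type to search for")
    parser.add_argument("-o", "--output",
                        help="file name for the results output")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

With this sketch, `-gd 2 -me 4 -t full` would arrive in the script as `args.generation == 2`, `args.meioses == 4`, and `args.type == "full"`.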

Once you have the three matrix files for an enumerated family (generation_depth.xlsx, meioses_event.xlsx, and half_full.xlsx), you can run the Relationship_Search script from the command line using:
```
python Relationship_Search.py -gd <generation_depth> -me <meioses> -t <type> -o <output_name>
```
Where:

`-gd`, `--generation` is the generation depth to search for, given as an integer.

`-me`, `--meioses` is the number of meiosis events to search for, given as an integer.

`-t`, `--type` is the type of relationship to search for: one of `half`, `full`, `direct`, or `NA`. A direct relationship is a direct descendant/ancestor, and `NA` applies to exclusions such as comparing an individual against themselves.

`-o`, `--output` is a string used as the file name for the results output by the Relationship_Search script.

Enumeration.py is the main enumeration script. Its input is a family relations .txt or .nx file, and its output is a results .csv file in which each row represents one relationship. If the user searched for a specific type of relationship during the initial run, the script can also output a file containing all pairs of individuals that match the queried relationship. Finally, Enumeration.py outputs three separate files based on the relationship matrices (generation_depth.xlsx, meioses_event.xlsx, and half_full.xlsx) for the final family listed in the family relations file, if multiple families are present.
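
To make the three quantities concrete, here is a small self-contained sketch that classifies pairs in a toy pedigree and counts the relationship types. It is not the algorithm used by Enumeration.py, which this README does not describe. The pedigree, the function names, and the definitions are assumptions: meioses is taken as the shortest path through a common ancestor, generation depth as the absolute generation difference, and `full`/`half` as two versus one closest common ancestor. Pedigrees with inbreeding loops are ignored.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy pedigree: each individual maps to its known parents.
PARENTS = {
    "gf": (), "gm": (), "mom": (), "stepmom": (),
    "dad": ("gf", "gm"), "aunt": ("gf", "gm"),
    "kid": ("dad", "mom"), "sib": ("dad", "mom"),
    "halfsib": ("dad", "stepmom"), "cousin": ("aunt",),
}


def ancestor_depths(person):
    """Map the person (depth 0) and each ancestor to its shortest distance."""
    depths = {person: 0}
    frontier = [person]
    while frontier:  # breadth-first walk up the pedigree
        next_frontier = []
        for p in frontier:
            for parent in PARENTS.get(p, ()):
                if parent not in depths:
                    depths[parent] = depths[p] + 1
                    next_frontier.append(parent)
        frontier = next_frontier
    return depths


def classify(a, b):
    """Return (generation_depth, meioses, type), or None if unrelated."""
    if a == b:
        return (0, 0, "NA")
    da, db = ancestor_depths(a), ancestor_depths(b)
    if b in da:  # b is an ancestor of a
        return (da[b], da[b], "direct")
    if a in db:  # a is an ancestor of b
        return (db[a], db[a], "direct")
    common = set(da) & set(db)
    if not common:
        return None
    meioses = min(da[c] + db[c] for c in common)
    closest = [c for c in common if da[c] + db[c] == meioses]
    depth = min(abs(da[c] - db[c]) for c in closest)
    return (depth, meioses, "full" if len(closest) >= 2 else "half")


def enumerate_relationships():
    """Count each relationship triple over all unordered related pairs."""
    results = (classify(a, b) for a, b in combinations(sorted(PARENTS), 2))
    return Counter(r for r in results if r is not None)
```

Under these assumed definitions, full siblings come out as `(0, 2, "full")`, half siblings as `(0, 2, "half")`, and a grandparent and grandchild as `(2, 2, "direct")`.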

Relationship_Search.py takes the generation_depth.xlsx, meioses_event.xlsx, and half_full.xlsx Excel files that Enumeration.py generates for an individual family. With those three files in the same directory, Relationship_Search.py can be run to search the family they represent for a specific relationship type. Output is written to the console before another search starts.
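
The search over the three matrices can be pictured as a filter on pairs whose entries match the query in all three at once. The sketch below uses small in-memory dictionaries in place of the .xlsx files so that it runs without any input. The square layout with individuals on both axes, the toy values, and the name `search_pairs` are assumptions, not details taken from Relationship_Search.py.

```python
# Hypothetical toy matrices for one family, indexed as matrix[a][b].
GENERATION = {
    "aunt": {"aunt": 0, "kid": 1, "sib": 1},
    "kid": {"aunt": 1, "kid": 0, "sib": 0},
    "sib": {"aunt": 1, "kid": 0, "sib": 0},
}
MEIOSES = {
    "aunt": {"aunt": 0, "kid": 3, "sib": 3},
    "kid": {"aunt": 3, "kid": 0, "sib": 2},
    "sib": {"aunt": 3, "kid": 2, "sib": 0},
}
HALF_FULL = {
    "aunt": {"aunt": "NA", "kid": "full", "sib": "full"},
    "kid": {"aunt": "full", "kid": "NA", "sib": "full"},
    "sib": {"aunt": "full", "kid": "full", "sib": "NA"},
}


def search_pairs(generation, meioses, half_full, gd, me, kind):
    """Return unordered pairs whose entries match the query in all three matrices."""
    people = sorted(generation)
    matches = []
    for i, a in enumerate(people):
        for b in people[i + 1:]:  # visit each unordered pair once
            if (generation[a][b] == gd
                    and meioses[a][b] == me
                    and half_full[a][b] == kind):
                matches.append((a, b))
    return matches


if __name__ == "__main__":
    # Query: one generation apart, three meioses, full relationship.
    for pair in search_pairs(GENERATION, MEIOSES, HALF_FULL, 1, 3, "full"):
        print(pair)
```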

Visualization is an R script that provides a code template for creating beeswarm plots from the results .csv output by Enumeration.py.

### How to cite this project?

Please email `joaquinmmagana@gmail.com` to get instructions on how to properly cite this project.

### Contributing

Please contact `joaquinmmagana@gmail.com` or Joaquín Magaña via Slack for contributions.
