# GraphPed
> A novel graph-based visualization method for large and complex pedigrees

Author: Yin Huang

Citation:

## Features
- It handles complex families, incorrect pedigrees, and multiple groups within a family.

- It helps to check pedigrees.

- It can show multiple traits or statuses in one pedigree.

- It can show pedigrees in a Jupyter notebook and export them in common image formats (pdf, svg, png).

## Install

`pip install graphped`
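
Rendering goes through Graphviz (the command-line help below lists `dot` as the default engine), so the Graphviz executables need to be on your PATH in addition to the Python package. The following is a minimal sketch for checking this before plotting; `graphviz_engine_available` is a hypothetical helper written for this README, not part of GraphPed.

```python
import shutil


def graphviz_engine_available(engine: str = "dot") -> bool:
    """Return True if an executable named `engine` can be found on PATH.

    Hypothetical helper for illustration; it is not part of GraphPed.
    """
    return shutil.which(engine) is not None


if graphviz_engine_available("dot"):
    print("Graphviz 'dot' found; pedigrees can be rendered.")
else:
    print("Graphviz 'dot' not found; install Graphviz and add it to PATH.")
```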

## How to use

#### 1. On the command line

In [1]:
!GraphPed -h

usage: GraphPed [-h] [-p PED] [-o OUTPUT] [-f FORMAT] [-a ATTRIBUTES]
                [-e ENGINE]

The arguments of graphped

optional arguments:
  -h, --help            show this help message and exit
  -p PED, --ped PED     a ped file or an extended ped file (default: None)
  -o OUTPUT, --output OUTPUT
                        output folder (default: ./)
  -f FORMAT, --format FORMAT
                        the format of the output picture (default: svg)
  -a ATTRIBUTES, --attributes ATTRIBUTES
                        the attributes of the output picture (default: None)
  -e ENGINE, --engine ENGINE
                        the engine of graphviz rendering the output picture
                        (default: dot)


- 1. Standard pedigrees in the ped file

```
GraphPed -p data/example_fam.ped -o data/cli/ -f pdf
```


- 2. Extended pedigrees in the ped file

```
GraphPed -p data/example_fam_ext.ped -o data/cli/ -f svg -a data/default.yaml 
```
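
If you want to call the command-line tool from a Python script, the flags listed in the help text above can be assembled into an argument list. This is only a sketch: `build_graphped_command` and `run_graphped` are hypothetical helpers, and only the flags shown by `GraphPed -h` are used.

```python
import subprocess
from typing import List, Optional


def build_graphped_command(
    ped: str,
    output: str = "./",
    fmt: str = "svg",
    attributes: Optional[str] = None,
    engine: str = "dot",
) -> List[str]:
    """Assemble a GraphPed command line from the documented flags.

    Hypothetical helper; defaults mirror those shown by `GraphPed -h`.
    """
    cmd = ["GraphPed", "-p", ped, "-o", output, "-f", fmt, "-e", engine]
    if attributes is not None:
        # -a is only needed when an attribute yaml file is used
        cmd += ["-a", attributes]
    return cmd


def run_graphped(cmd: List[str]) -> None:
    """Run the assembled command; raises CalledProcessError on failure."""
    subprocess.run(cmd, check=True)


cmd = build_graphped_command(
    "data/example_fam_ext.ped",
    output="data/cli/",
    attributes="data/default.yaml",
)
print(" ".join(cmd))
```

Calling `run_graphped(cmd)` would then execute it, assuming `GraphPed` and Graphviz are installed.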

#### 2. In Jupyter notebook

In [2]:
from graphped.plot import *

In [3]:
fam=readped('data/example_fam.ped')
plotped(fam)

Note: `plotped` needs the Graphviz executables (for example `dot`) on your system's PATH. Without them, it fails with `ExecutableNotFound: failed to execute PosixPath('dot')`.

Or

In [None]:
show(GraphPed(fam))

The `GraphPed` function plots all the pedigrees in the `fam` dataframe.

To add self-defined attributes, make sure the number of traits in the input file matches the number of traits in the attribute yaml file.

In [None]:
attrs=load_attributes('data/default.yaml')
famext=readped('data/example_fam_ext.ped',attrs)
plotped(famext,attrs)
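
The trait-count requirement can be checked before plotting. The sketch below assumes the usual ped layout seen in the examples in this README (family ID, individual ID, father, mother, sex, then one column per trait) and an attribute mapping with one top-level key per trait; `count_trait_columns` and `traits_match` are hypothetical helpers, not GraphPed functions.

```python
from typing import Any, Dict

# Assumption: the first five ped columns are family ID, individual ID,
# father, mother and sex; every remaining column is a trait.
N_FIXED_COLUMNS = 5


def count_trait_columns(ped_line: str) -> int:
    """Count the trait columns on one whitespace-separated ped line."""
    return len(ped_line.split()) - N_FIXED_COLUMNS


def traits_match(ped_line: str, attrs: Dict[str, Any]) -> bool:
    """True if the ped line has exactly one column per trait in `attrs`."""
    return count_trait_columns(ped_line) == len(attrs)


# Illustrative attribute mapping with three traits, shaped like the yaml file
example_attrs = {
    "trait1": {"fillcolor": {1: "white", 2: "dimgrey", -9: "aquamarine3"}},
    "trait2": {"style": {True: "filled,setlinewidth(4)", False: "filled"}},
    "trait3": {"fontcolor": {True: "darkorange", False: "black"}},
}

extended_line = "Fam1\tF4\tP3\tF1\t1\t1\tTrue\tFalse"
standard_line = "Fam\tF4\tP3\tF1\t1\t1"

print(traits_match(extended_line, example_attrs))  # three traits on both sides
print(traits_match(standard_line, example_attrs))  # one trait column vs three traits
```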

Write to the output folder in pdf format.

In [None]:
plotped(famext,attrs,output='data/jpn',format='pdf')

Or output multiple pedigrees.

In [None]:
GraphPed(famext,attrs,output='data/jpn',format='pdf')

## Tutorial

### Setting the attribute yaml file

- reference: 
    - fillcolor https://graphviz.org/docs/attrs/fillcolor/
    - style https://graphviz.org/docs/attrs/style/
    - fontcolor https://graphviz.org/docs/attrs/fontcolor/
    - ... https://graphviz.org/doc/info/attrs.html

For a ped file with one trait, you don't need to set the attribute file if the trait values are affected status coded as follows: -9 or 0 is missing, 1 is unaffected, and 2 is affected. Otherwise, you need to set your attribute file as follows.

The format of the attribute yaml file:
```
trait name:
    attribute name:
        (the pairs of trait value and attribute value)
        trait value1: attribute value1
        trait value2: attribute value2
        ...
```
If you have more than one trait, you need to set each trait separately in the yaml file.
The following is an example.

In [None]:
%%writefile data/default.yaml

trait1:
    fillcolor:
        1: 'white'
        2: 'dimgrey'
        -9: 'aquamarine3'

trait2:
    style:
        True: filled,setlinewidth(4)
        False: filled
    

trait3:
    fontcolor:
        True: darkorange
        False: black
    

In [None]:
attrs=load_attributes('data/default.yaml')

In [None]:
attrs
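
To make the yaml layout concrete, the sketch below mirrors `data/default.yaml` as a nested Python dict (trait name, then attribute name, then trait value to attribute value) and looks up the Graphviz attributes for one individual. It illustrates the structure only; the exact object returned by `load_attributes` and the way GraphPed applies it internally are not shown in this README, and `node_attributes` is a hypothetical helper.

```python
from typing import Any, Dict

# Assumption: a plain nested dict with the same shape as data/default.yaml
ATTRS: Dict[str, Dict[str, Dict[Any, str]]] = {
    "trait1": {"fillcolor": {1: "white", 2: "dimgrey", -9: "aquamarine3"}},
    "trait2": {"style": {True: "filled,setlinewidth(4)", False: "filled"}},
    "trait3": {"fontcolor": {True: "darkorange", False: "black"}},
}


def node_attributes(trait_values: Dict[str, Any],
                    attrs: Dict[str, Dict[str, Dict[Any, str]]] = ATTRS) -> Dict[str, str]:
    """Resolve Graphviz node attributes for one individual's trait values."""
    resolved: Dict[str, str] = {}
    for trait, value in trait_values.items():
        # Each trait may drive one or more Graphviz attributes
        for attr_name, value_map in attrs.get(trait, {}).items():
            if value in value_map:
                resolved[attr_name] = value_map[value]
    return resolved


# An affected individual (trait1 = 2) with trait2 = True and trait3 = False
print(node_attributes({"trait1": 2, "trait2": True, "trait3": False}))
```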

### Two example pedigrees
One is standard; the other is extended with 3 traits.

#### Standard ped file

In [None]:
%%writefile data/example_fam.ped
Fam	F4	P3	F1	1	1
Fam	F3	P3	F1	2	1
Fam	F2	P3	F1	2	1
Fam	F1	P1	P2	2	2
Fam	P3	0	0	1	2
Fam	P1	0	0	1	-9
Fam	P2	0	0	2	-9

In [None]:
fam=readped('data/example_fam.ped')

In [None]:
fam
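
For readers unfamiliar with the ped layout, here is a standard-library sketch that parses the same tab-separated example and decodes the affected status with the default coding described above (-9 or 0 missing, 1 unaffected, 2 affected). It is not how `readped` is implemented; `parse_ped` is a hypothetical helper, and apart from `fid` the column names are assumptions chosen for illustration.

```python
import csv
import io

PED_TEXT = (
    "Fam\tF4\tP3\tF1\t1\t1\n"
    "Fam\tF3\tP3\tF1\t2\t1\n"
    "Fam\tF2\tP3\tF1\t2\t1\n"
    "Fam\tF1\tP1\tP2\t2\t2\n"
    "Fam\tP3\t0\t0\t1\t2\n"
    "Fam\tP1\t0\t0\t1\t-9\n"
    "Fam\tP2\t0\t0\t2\t-9\n"
)

# Assumed column names for the six standard ped columns
COLUMNS = ["fid", "iid", "father", "mother", "sex", "trait"]

# Default affected-status coding described in this README
STATUS = {"-9": "missing", "0": "missing", "1": "unaffected", "2": "affected"}


def parse_ped(text: str):
    """Parse tab-separated ped text into a list of dicts (values kept as strings)."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [dict(zip(COLUMNS, row)) for row in reader if row]


rows = parse_ped(PED_TEXT)
for row in rows:
    print(row["fid"], row["iid"], STATUS[row["trait"]])
```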

In [None]:
plotped(fam)

#### Extended ped file

In [None]:
%%writefile data/example_fam_ext.ped
Fam1	F4	P3	F1	1	1	True	False
Fam1	F3	P3	F1	2	1	True	True
Fam1	F2	P3	F1	2	1	True	False
Fam1	F1	P1	P2	2	2	True	False
Fam1	P3	0	0	1	2	True	False
Fam1	P1	0	0	1	-9	False	True
Fam1	P2	0	0	2	-9	False	True

In [None]:
famext=readped('data/example_fam_ext.ped',attrs)

In [None]:
famext

In [None]:
plotped(famext,attrs)

In [None]:
dots=GraphPed(fam)

In [None]:
show(dots)

### Write out plots

In [None]:
plotped(fam,output='data/exampleplots',format='png')

Show the plot from `data/exampleplots/Fam.png`

![data/exampleplots/Fam.png](data/exampleplots/Fam.png "Fam.png")
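
Judging from the example above, the image is written to the output folder under the family ID with the chosen format as the extension. The sketch below only reproduces that observed pattern so you can locate the files afterwards; the naming rule is inferred from this single example, and `expected_plot_path` is a hypothetical helper.

```python
from pathlib import PurePosixPath


def expected_plot_path(output: str, family_id: str, fmt: str) -> PurePosixPath:
    """Build the path where a family's plot is expected to be written.

    Assumption: files are named <output>/<family ID>.<format>, as seen
    with data/exampleplots/Fam.png above.
    """
    return PurePosixPath(output) / f"{family_id}.{fmt}"


print(expected_plot_path("data/exampleplots", "Fam", "png"))
```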

## Real data examples

In [None]:
all_fam=readped('data/Fig_2_3_fam.ped')

In [None]:
all_fam

#### Fig.1 The workflow of GraphPed

show workflow

#### Fig.2 The pedigrees of complex families

In [None]:
plotped(all_fam[all_fam.fid=='25_2'])

#### Fig.S2 The largest pedigree in ADSP

In [None]:
plotped(all_fam[all_fam.fid=='4_649'])

#### Fig.3 The pedigrees with incorrect information

In [None]:
plotped(all_fam[all_fam.fid=='10R_R99'])
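
Plotting makes such problems visible, and simple checks can also be scripted. The sketch below is not part of GraphPed and does not describe the errors in the family above; it assumes the conventional ped coding (sex 1 is male, 2 is female, and a parent ID of 0 means unknown) and flags parents that are not listed or whose sex contradicts their role. `check_family` and the toy family are hypothetical.

```python
from typing import Dict, List


def check_family(rows: List[Dict[str, str]]) -> List[str]:
    """Return human-readable problems found in one family's ped rows."""
    by_id = {r["iid"]: r for r in rows}
    problems: List[str] = []
    for r in rows:
        # Assumed coding: fathers have sex "1", mothers have sex "2"
        for role, expected_sex in (("father", "1"), ("mother", "2")):
            parent = r[role]
            if parent == "0":  # unknown parent, nothing to check
                continue
            if parent not in by_id:
                problems.append(f"{r['iid']}: {role} {parent} is not listed")
            elif by_id[parent]["sex"] != expected_sex:
                problems.append(
                    f"{r['iid']}: {role} {parent} has sex {by_id[parent]['sex']}"
                )
    return problems


# Toy family; the last row is deliberately inconsistent
family = [
    {"iid": "P1", "father": "0", "mother": "0", "sex": "1"},
    {"iid": "P2", "father": "0", "mother": "0", "sex": "2"},
    {"iid": "F1", "father": "P1", "mother": "P2", "sex": "2"},
    {"iid": "F2", "father": "P2", "mother": "PX", "sex": "1"},
]

for problem in check_family(family):
    print(problem)
```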

#### Fig.4 The pedigrees with multiple phenotypes

Self-defined multiple-trait yaml:

In [None]:
%%writefile data/self_defined_mutiple_traits.yaml

ad:
    fillcolor:
        1: 'white'
        2: 'dimgrey'
        -9: 'aquamarine3'

vcf:
    style:
        True: filled,setlinewidth(4)
        False: filled
    

trim:
    fontcolor:
        True: darkorange
        False: black
    

In [None]:
attrs=load_attributes('data/self_defined_mutiple_traits.yaml')
ped=readped('data/Fig4_fam_ext.ped',attrs)

In [None]:
ped

In [None]:
plotped(ped,attrs)

#### Show multiple figures with self-defined attributes

Self-defined single-trait yaml:

In [None]:
%%writefile data/self_defined_single_trait.yaml

ad:
    fillcolor:
        1: 'white'
        2: 'dimgrey'
        -9: 'aquamarine3'

In [None]:
attr=load_attributes('data/self_defined_single_trait.yaml')
all_fam=readped('data/Fig_2_3_fam.ped',attr)

In [None]:
all_fam

In [None]:
dots=GraphPed(all_fam,attr)
show(dots)