nuc_process for hybrid strain analysis #10

Closed
chenggang108 opened this issue Apr 23, 2020 · 1 comment

Comments

@chenggang108

Could you please provide an example of the homologous_chromos HOM_CHROMO_TSV_FILE file, or explain its format a little?

Thanks

Gang

@tjs23
Owner

tjs23 commented Apr 24, 2020

This file should be a tab-separated list of paired chromosome/sequence identifiers (i.e. one pair per line), where each pair specifies two chromosome sequences, from different genome builds, which are homologous. However, this mechanism is now deprecated for calculating structures from hybrid data. Instead it is better to use the latest "master" branch version (not the first release version), which is more advanced.

In the more recent version there should be a chromosome naming file for each genome build (specified with the -cn and -cn2 flags). Each file has two columns, separated by spaces or tabs, and maps the chromosome sequence identifier in the first column (e.g. as it appears in the FASTA sequence file) to a simple name, like "chr1", in the second column. For example, for mouse build mm10:

NC_000067.6	chr1
NC_000068.7	chr2
NC_000069.6	chr3
...
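
To make the two-column layout concrete, here is a minimal sketch of reading such a naming file into a dictionary. It is not the parser nuc_process itself uses; the function name `read_naming_file` is hypothetical, and skipping blank lines is an assumption made for illustration.

```python
import io


def read_naming_file(handle):
    """Map each sequence identifier (column 1) to its simple name (column 2).

    Hypothetical helper for illustration only; not part of nuc_process.
    """
    names = {}
    for line_num, line in enumerate(handle, 1):
        line = line.strip()
        if not line:
            continue  # Assumption: blank lines are ignored
        fields = line.split()  # Splits on any run of spaces or tabs
        if len(fields) != 2:
            raise ValueError(
                "Line %d: expected 2 columns, found %d" % (line_num, len(fields))
            )
        seq_id, simple_name = fields
        names[seq_id] = simple_name
    return names


# The mm10 lines shown above, held in memory instead of a file on disk
example = io.StringIO(
    "NC_000067.6\tchr1\n"
    "NC_000068.7\tchr2\n"
    "NC_000069.6\tchr3\n"
)
mm10_names = read_naming_file(example)
print(mm10_names["NC_000067.6"])
```

With a real file, the same function could be called as `read_naming_file(open(path))`, where `path` is whatever file you pass to -cn or -cn2.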

The simple name should match a chromosome name in the other naming file to define a homologous pair. The "nuc_sequence_names" program is provided to create a naming file automatically; it tries to find simple chromosome names for the sequence accession codes found in a FASTA file of a genome build. Note that if the sequence names for the genome build are already "chr1", "chr2", "chr3" etc., then you can use a naming file of the form:

chr1	chr1
chr2	chr2
chr3	chr3
...

And naturally the second column should match the names for the hybrid's other genome build.
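
The matching rule described above (two sequences are homologous when their naming files give them the same simple name) can be sketched as follows. This only illustrates the rule and is not how nuc_process implements it; the function name `homologous_pairs` is hypothetical, and the two dictionaries stand in for the contents of the -cn and -cn2 files.

```python
def homologous_pairs(names_a, names_b):
    """Pair sequence identifiers from two builds that share a simple name.

    Hypothetical helper for illustration only; not part of nuc_process.
    Each argument maps sequence identifier -> simple name.
    """
    # Invert the second mapping: simple name -> sequence identifier
    by_name_b = {name: seq_id for seq_id, name in names_b.items()}

    pairs = []
    for seq_id_a, name in names_a.items():
        seq_id_b = by_name_b.get(name)
        if seq_id_b is not None:  # Names present in only one build are skipped
            pairs.append((name, seq_id_a, seq_id_b))
    return pairs


# First build: accession codes mapped to simple names (as in the mm10 example)
build_a = {"NC_000067.6": "chr1", "NC_000068.7": "chr2"}
# Second build: sequences already named "chr1", "chr2", ... so names map to themselves
build_b = {"chr1": "chr1", "chr2": "chr2", "chrX": "chrX"}

for name, seq_id_a, seq_id_b in homologous_pairs(build_a, build_b):
    print(name, seq_id_a, seq_id_b)
```

Here "chrX" has no partner in the first mapping, so it produces no pair; a check like this can help spot naming mismatches between the two files before running the analysis.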

@tjs23 tjs23 closed this as completed Apr 24, 2020