HiCKRy.py Key errors #41

Open
Jssmith91 opened this issue Apr 9, 2021 · 8 comments

@Jssmith91

Hi

I have been trying to run HiCKRy.py on data dumped from Juicer. The contact counts were dumped with no normalisation at 1kb resolution (we have more than 4 billion contacts). The error I keep getting from HiCKRy.py is as follows:

Creating sparse matrix...
Traceback (most recent call last):
  File "HiCKRy.py", line 283, in <module>
    main()
  File "HiCKRy.py", line 276, in main
    matrix,revFrag = loadfastfithicInteractions(args.interactions, args.fragments)
  File "HiCKRy.py", line 45, in loadfastfithicInteractions
    x.append(fragDic[chrom1][mid1])
KeyError: '1'

The contacts file generated looks like this:

1 87000 1 87000 2.0
1 87000 1 88000 1.0
1 137000 1 139000 1.0
1 181000 1 181000 17.0
1 181000 1 182000 2.0
1 182000 1 182000 1.0
1 187000 1 190000 1.0
1 190000 1 191000 1.0
1 597000 1 598000 1.0
1 598000 1 599000 1.0

The fragment file generated looks like this:

chr1 0 500 1 1
chr1 1000 1500 1 1
chr1 2000 2500 1 1
chr1 3000 3500 1 1
chr1 4000 4500 1 1
chr1 5000 5500 1 1
chr1 6000 6500 1 1
chr1 7000 7500 1 1
chr1 8000 8500 1 1
chr1 9000 9500 1 1

Do you have any suggestions for this or would it be easier to dump the contacts from Juicer with the KR normalisation already applied?

Thanks in advance,

James

@aryakaul
Collaborator

aryakaul commented Apr 9, 2021

Hey James,

You'll want to make sure that the chromosome names in both files are identical. So the contacts file should use 'chr1' instead of '1'; something like this awk command should do it: awk '{printf "chr%s\t%s\tchr%s\t%s\t%s\n",$1,$2,$3,$4,$5}' $CONTACTSFILE
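
A quick way to confirm the naming mismatch before rerunning is to compare the chromosome names in the two files. The following is a minimal sketch, not part of HiCKRy.py: the helper `chrom_names` is hypothetical, and it assumes whitespace-separated columns with chromosome names in columns 1 and 3 of the contacts file and column 1 of the fragments file, as in the samples above.

```python
def chrom_names(lines, columns):
    """Collect the chromosome names found in the given 0-based columns."""
    names = set()
    for line in lines:
        fields = line.split()
        if fields:
            names.update(fields[c] for c in columns)
    return names


# Sample rows copied from the files shown in this issue.
contacts = ["1 87000 1 87000 2.0", "1 87000 1 88000 1.0"]
fragments = ["chr1 0 500 1 1", "chr1 1000 1500 1 1"]

contact_chroms = chrom_names(contacts, (0, 2))   # assumed: columns 1 and 3
fragment_chroms = chrom_names(fragments, (0,))   # assumed: column 1

# Names used in the contacts file that the fragments file never mentions.
missing = sorted(contact_chroms - fragment_chroms)
print("chromosomes in contacts but not in fragments:", missing)
```

For real files, the same function can be fed an open file handle instead of a list of strings.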

@Jssmith91
Author

Hi, these now have exactly the same naming and I still get the following:

Creating sparse matrix...
Traceback (most recent call last):
  File "HiCKRy.py", line 283, in <module>
    main()
  File "HiCKRy.py", line 276, in main
    matrix,revFrag = loadfastfithicInteractions(args.interactions, args.fragments)
  File "HiCKRy.py", line 45, in loadfastfithicInteractions
    x.append(fragDic[chrom1][mid1])
KeyError: 45000

@ay-lab
Owner

ay-lab commented Apr 12, 2021

I think you have fixed this issue, but this was again a mismatch between the two input files: your contacts file did not list midpoints in columns 2 and 4.
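
To illustrate the midpoint requirement, here is a small sketch that shifts contact coordinates to bin midpoints. It rests on an assumption the thread does not confirm: that the dumped coordinates (e.g. 87000) are bin start positions, so the midpoint is `start + resolution // 2`, matching the 500, 1500, ... values in the third column of the 1kb fragments file. The function name `start_to_mid` is hypothetical.

```python
def start_to_mid(line, resolution):
    """Rewrite one contacts line so columns 2 and 4 hold bin midpoints.

    Assumes the input coordinates are bin START positions (an assumption,
    not something HiCKRy.py checks for you).
    """
    chrom1, start1, chrom2, start2, count = line.split()
    half = resolution // 2
    mid1 = int(start1) + half
    mid2 = int(start2) + half
    return f"{chrom1}\t{mid1}\t{chrom2}\t{mid2}\t{count}"


# One row from the contacts sample, after the chr-prefix fix, at 1kb resolution.
converted = start_to_mid("chr1 87000 chr1 88000 1.0", 1000)
print(converted)
```

If the coordinates in a given dump are already midpoints, this shift must not be applied.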

@DaianeH

DaianeH commented Aug 15, 2022

I'm getting the same key errors.

My contact file looks like:

10 100005000 10 100005000 17
10 100005000 10 100015000 19
10 100005000 10 100025000 3
10 100005000 10 100035000 3
10 100005000 10 100045000 6
10 100005000 10 100055000 7
10 100005000 10 100065000 2
10 100005000 10 100075000 2
10 100005000 10 100095000 1
10 100005000 10 100105000 6

The fragment file looks like:

1 0 5000 1 1
1 10000 15000 1 1
1 20000 25000 1 1
1 30000 35000 1 1
1 40000 45000 1 1
1 50000 55000 1 1
1 60000 65000 1 1
1 70000 75000 1 1
1 80000 85000 1 1
1 90000 95000 1 1

What can the problem be?

@DaianeH

DaianeH commented Aug 19, 2022

Just to be more precise, the error is:

Creating sparse matrix...
Traceback (most recent call last):
  File "fithic/fithic/utils/HiCKRy.py", line 283, in <module>
    main()
  File "fithic/fithic/utils/HiCKRy.py", line 276, in main
    matrix,revFrag = loadfastfithicInteractions(args.interactions, args.fragments)
  File "fithic/fithic/utils/HiCKRy.py", line 46, in loadfastfithicInteractions
    y.append(fragDic[chrom2][mid2])
KeyError: 82155000

@ay-lab
Owner

ay-lab commented Aug 19, 2022

You have the midpoint "82155000" in your contacts file (not sure on which chromosome, so you may want to run chromosome by chromosome, or put a print statement in the code), but it seems you do not have it listed in the fragments file. That is the error.
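
As an alternative to editing HiCKRy.py, a standalone check can list every contact whose (chromosome, midpoint) pair is absent from the fragments file. This is only a sketch under stated assumptions: the helpers `load_midpoints` and `find_missing` are hypothetical, columns are whitespace-separated, and the fragment midpoint is taken from the third column, as the sample files in this thread suggest.

```python
def load_midpoints(fragment_lines):
    """Map each chromosome name to the set of midpoints listed for it."""
    mids = {}
    for line in fragment_lines:
        fields = line.split()
        if len(fields) >= 3:
            # Assumed layout: chromosome in column 1, midpoint in column 3.
            mids.setdefault(fields[0], set()).add(int(fields[2]))
    return mids


def find_missing(contact_lines, mids):
    """Return (line number, chromosome, midpoint) for every unknown key."""
    missing = []
    for lineno, line in enumerate(contact_lines, start=1):
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        for chrom, mid in ((fields[0], fields[1]), (fields[2], fields[3])):
            if int(mid) not in mids.get(chrom, set()):
                missing.append((lineno, chrom, int(mid)))
    return missing


# Tiny made-up inputs, for illustration only.
fragments = ["1 0 5000 1 1", "1 10000 15000 1 1"]
contacts = ["1 5000 1 15000 3", "1 5000 1 25000 2", "10 5000 10 5000 1"]

missing = find_missing(contacts, load_midpoints(fragments))
for lineno, chrom, mid in missing:
    print(f"line {lineno}: chromosome {chrom!r}, midpoint {mid} not in fragments")
```

Both functions accept any iterable of lines, so gzipped inputs can be passed as `gzip.open(path, "rt")` handles. The reported line numbers show which chromosome the offending midpoint belongs to.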

@DaianeH

DaianeH commented Aug 19, 2022

I created this fragments file using:

python createFitHiCFragments-fixedsize.py --chrLens fithic_protocol_data/data/referenceGenomes/hg19wY-lengths --resolution 10000 --outFile myfile.fragmentsfile.gz

The contacts file was created from a validPairs file with:

sh validPairs2FitHiC-fixedSize.sh 10000 myfile myfile_validPairs.txt .

Is that correct?
And if these files were created correctly, how can I solve the key error?

Thank you,

@ay-lab
Owner

ay-lab commented Aug 19, 2022

They look correct. We need to see the full contacts file and fragments file to help you, unless you can trace back the entry with midpoint 82155000 yourself.
