
Trouble reading SRA CRAM files #1254

Closed
outpaddling opened this issue May 21, 2020 · 21 comments · Fixed by samtools/htslib#1105

Comments

@outpaddling

I'm getting these "Unable to fetch reference" errors when trying to run samtools view on a GCP instance:

Fusera is ready!
Remember, Fusera needs to keep running in order to serve your files, don't close this terminal!
NWD102903.b38.irc.v1.cram	NWD102903.freeze5.v1.vcf.gz
NWD102903.b38.irc.v1.cram.crai	NWD102903.freeze5.v1.vcf.gz.csi
/usr/local/bin/samtools
[E::cram_get_ref] Failed to populate reference for id 0
[E::cram_decode_slice] Unable to fetch reference #0 9997..44254
[E::cram_next_slice] Failure to decode slice
[main_samview] truncated file.

Here's the code producing the output above:

fusera mount -t ../SRA/prj_13558_D25493.ngc -a "$srr" "$mount_dir" &
while [ ! -e "$mount_dir/$srr/$sample.b38.irc.v1.cram" ]; do
	printf "Waiting for fusera...\n"
	sleep 1
done
ls "$mount_dir/$srr"
which samtools
samtools view "$mount_dir/$srr/$sample.b38.irc.v1.cram" | more

I can access the CRAM files with other tools, e.g. cat to /dev/null.

The exact same samtools builds and CRAM files work on a local machine to which I've downloaded a few of the CRAM files.

I've tried CentOS 8 and CentOS 7 on the GCP instances. On CentOS 8 I've tried samtools 1.9 built from scratch and by pkgsrc, as well as samtools 1.10 from pkgsrc.

Locally I can process the same CRAM files under CentOS 7 and FreeBSD without any issues. I've stripped the command down to the most basic possible and still consistently get this error.

All systems are up-to-date with the latest patches.

Any ideas what might cause this?

Thanks much.

Jason
@outpaddling
Author

More info:
I can trigger the same error message, plus an earlier one from the reference download attempt, on our local cluster by removing rootcerts so that libcurl cannot validate the reference URL:

~/Data/TOPMed 1006: samtools view SRR6990379/NWD104303.b38.irc.v1.cram | more
[W::find_file_url] Failed to open reference "https://www.ebi.ac.uk/ena/cram/md5/6aef897c3d6ff0c78aff06ac189178dd": Input/output error
[E::cram_get_ref] Failed to populate reference for id 0
[E::cram_decode_slice] Unable to fetch reference #0 10000..33951
[E::cram_next_slice] Failure to decode slice
[main_samview] truncated file.

But the GCP instance has no trouble accessing the URL directly with curl, so it's not a routing or firewall issue:

$ curl https://www.ebi.ac.uk/ena/cram/md5/6aef897c3d6ff0c78aff06ac189178dd > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
 25  237M   25 59.8M    0     0  6331k      0  0:00:38  0:00:09  0:00:29 10.3M^C

I then tried uploading ~/.cache/hts-ref from the local cluster to the GCP instance to see if it would work around the issue, but I still get the same error.

@outpaddling
Author

My last comment about uploading an hts-ref directory was incorrect: it actually does work around the issue. I had set REF_PATH in a previous attempt to fix the problem and forgot to remove it, so samtools was ignoring my ~/.cache/hts-ref directory.
Without an hts-ref directory built elsewhere, samtools now communicates with the EBI site rather than producing the error above, but it does not seem to make any progress. I do not know what caused this change in behavior. iftop shows only a few KB/sec of traffic in either direction, and the download accomplishes nothing even after a couple of hours, whereas on my local machine it only takes a couple of minutes to download the necessary references and push them to the cloud instance.
It's a usable workaround, so this is no longer an urgent issue for me.
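For anyone who hits the same trap, here is a small sketch of my own (not part of samtools) to check whether a forgotten REF_PATH setting is shadowing the default cache; when REF_PATH is set, htslib searches only the locations it lists and skips ~/.cache/hts-ref:

```shell
#!/bin/sh
# Report whether REF_PATH is set; when it is, htslib consults only the
# locations it lists and ignores the default ~/.cache/hts-ref cache.
check_ref_path() {
    if [ -n "${REF_PATH:-}" ]; then
        echo "REF_PATH is set: $REF_PATH"
    else
        echo "REF_PATH unset; default cache applies"
    fi
}

check_ref_path
```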

@outpaddling
Author

Now I'm getting the same error on my FreeBSD workstation, where it was working flawlessly last week. Manually downloading from the command line works fine, though. I'm using the script below as a workaround; it takes the URL straight from the error message:

[W::find_file_url] Failed to read reference "https://www.ebi.ac.uk/ena/cram/md5/76635a41ea913a405ded820447d067b0": Input/output error
[E::cram_get_ref] Failed to populate reference for id 2
[E::cram_decode_slice] Unable to fetch reference #2 9998..39310
[bacon4000_gmail_com@topmed SRA]$ ./cache-hts-ref https://www.ebi.ac.uk/ena/cram/md5/76635a41ea913a405ded820447d067b0
https://www.ebi.ac.uk/ena/cram/md5/76635a41ea913a405ded820447d067b0
/home/bacon4000_gmail_com/.cache/hts-ref/76/63/5a41ea913a405ded820447d067b0
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  189M  100  189M    0     0  8757k      0  0:00:22  0:00:22 --:--:-- 11.4M
#!/bin/sh -e

##########################################################################
#   Script description:
#       Download and cache a reference for samtools
#       
#   History:
#   Date        Name        Modification
#   2020-06-05              Begin
##########################################################################

usage()
{
    printf 'Usage: %s url\n' "$0"
    exit 1
}


##########################################################################
#   Main
##########################################################################

if [ $# != 1 ]; then
    usage
fi
url="$1"

echo "$url"
# The hash is the last component of the URL
hash=$(basename "$url")
gpdir=$(echo "$hash" | cut -c 1-2)
pdir=$(echo "$hash" | cut -c 3-4)
fname=$(echo "$hash" | cut -c 5-)
dir="$HOME/.cache/hts-ref/$gpdir/$pdir"
printf '%s/%s\n' "$dir" "$fname"
mkdir -p "$dir"
curl -o "$dir/$fname" "$url"
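As a follow-up sketch of my own (not part of samtools), a cached file can be sanity-checked by recomputing its md5sum and comparing it to the hash encoded in its path; the EBI endpoint serves the raw sequence, so the two should agree. The path in the final call is just the example from the error output above:

```shell
#!/bin/sh
# Sketch: verify that a cached reference's md5sum matches the hash
# encoded in its hts-ref path (assumes coreutils md5sum is available).
check_cached_ref() {
    f=$1
    [ -e "$f" ] || { echo "missing $f"; return 1; }
    # Reassemble the hash from the last three path components
    expected=$(echo "$f" | awk -F/ '{print $(NF-2) $(NF-1) $NF}')
    actual=$(md5sum "$f" | cut -d ' ' -f 1)
    if [ "$expected" = "$actual" ]; then
        echo "OK $f"
    else
        echo "MISMATCH $f"
    fi
}

check_cached_ref "$HOME/.cache/hts-ref/76/63/5a41ea913a405ded820447d067b0"
```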

@whitwham
Contributor

whitwham commented Jun 9, 2020

Hello Jason,

Sorry for not replying sooner. With samtools 1.10 you can turn on libcurl logging, which should give us more information on what is going on. Could you try samtools view --verbosity=8 ... and see what happens?

@outpaddling
Author

outpaddling commented Jun 30, 2020

Good timing for this new feature. Here's my output from a samtools 1.10 run. It doesn't mean much to me, but maybe it suggests a problem communicating with the EBI squid server?

Running curl outside samtools still works fine.

Thanks,

JB

Linux pbulkc7.hpc  bacon ~ 247: (pkgsrc): curl -O https://www.ebi.ac.uk/ena/cram/md5/6aef897c3d6ff0c78aff06ac189178dd
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  237M  100  237M    0     0  7818k      0  0:00:31  0:00:31 --:--:-- 5336k

Linux pbulkc7.hpc  bacon ~ 246: (pkgsrc): samtools view --verbosity=8 NWD102903.b38.irc.v1.cram 
[I::cram_populate_ref] Running cram_populate_ref on fd 0x1af4130, id 0
[I::cram_populate_ref] Populating local cache: /home/bacon/.cache/hts-ref/%2s/%2s/%s
[I::cram_populate_ref] Querying ref 6aef897c3d6ff0c78aff06ac189178dd
[D::init_add_plugin] Loaded "knetfile"
[D::init_add_plugin] Loaded "mem"
[D::init_add_plugin] Loaded "libcurl"
[D::init_add_plugin] Loaded "gcs"
[D::init_add_plugin] Loaded "s3"
[D::init_add_plugin] Loaded "s3w"
*   Trying 193.62.193.80:443...
* Connected to www.ebi.ac.uk (193.62.193.80) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: none
  CApath: /home/bacon/Pkgsrc/pkg/etc/openssl/certs
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server did not agree to a protocol
* Server certificate:
*  subject: jurisdictionC=GB; businessCategory=Government Entity; serialNumber=Government Entity; C=GB; ST=Essex; L=Saffron Walden; O=European Bioinformatics Institute; CN=www.ebi.ac.uk
*  start date: May  5 11:33:46 2020 GMT
*  expire date: May  5 11:43:00 2022 GMT
*  subjectAltName: host "www.ebi.ac.uk" matched cert's "www.ebi.ac.uk"
*  issuer: C=BM; O=QuoVadis Limited; CN=QuoVadis EV SSL ICA G3
*  SSL certificate verify ok.
> GET /ena/cram/md5/6aef897c3d6ff0c78aff06ac189178dd HTTP/1.1
Host: www.ebi.ac.uk
User-Agent: htslib/1.10.2 libcurl/7.70.0
Accept: */*

* Mark bundle as not supporting multiuse
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Cache-Control: max-age=31536000
< X-Cache: HIT from pg-ena-cram-1.ebi.ac.uk
< Content-Type: text/plain
< Strict-Transport-Security: max-age=0
< Date: Sun, 28 Jun 2020 04:11:06 GMT
< X-Cache-Lookup: HIT from pg-ena-cram-1.ebi.ac.uk:8080
< Via: 1.0 pg-ena-cram-1.ebi.ac.uk (squid/3.1.23)
* HTTP/1.0 connection set to keep alive!
< Connection: keep-alive
< Age: 219359
< Warning: 113 pg-ena-cram-1.ebi.ac.uk (squid/3.1.23) This cache hit is still fresh and more than 1 day old
< Content-Length: 248956422
< 
* Connection #0 to host www.ebi.ac.uk left intact
[W::find_file_url] Failed to read reference "https://www.ebi.ac.uk/ena/cram/md5/6aef897c3d6ff0c78aff06ac189178dd": Input/output error
[E::cram_get_ref] Failed to populate reference for id 0
[E::cram_decode_slice] Unable to fetch reference #0 9997..44254

[E::cram_next_slice] Failure to decode slice
[main_samview] truncated file.

@daviesrob
Member

It looks like samtools managed to contact the EBI server and got the HTTP headers back. Then something went wrong while it was trying to read the data. Input/output error is the catch-all for any errors we weren't expecting, so it's still a mystery as to exactly what's going on. Some extra logging in the bit of code that handles the error might help.

Does the error happen all the time, or is it intermittent? Trying curl -v might be interesting as well, to see if it's doing anything special to get the data (e.g. resuming a partial download).

@outpaddling
Author

It happens every time. Here's some verbose curl output. Tell you anything?

Thanks,

JB

Linux pbulkc7.hpc  bacon ~ 310: (pkgsrc): curl -v -O https://www.ebi.ac.uk/ena/cram/md5/6aef897c3d6ff0c78aff06ac189178dd
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 193.62.193.80:443...
* Connected to www.ebi.ac.uk (193.62.193.80) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: none
  CApath: /home/bacon/Pkgsrc/pkg/etc/openssl/certs
} [5 bytes data]
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* TLSv1.3 (IN), TLS handshake, Server hello (2):
{ [91 bytes data]
* TLSv1.2 (IN), TLS handshake, Certificate (11):
{ [5124 bytes data]
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
{ [333 bytes data]
* TLSv1.2 (IN), TLS handshake, Server finished (14):
{ [4 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
} [70 bytes data]
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
} [1 bytes data]
* TLSv1.2 (OUT), TLS handshake, Finished (20):
} [16 bytes data]
* TLSv1.2 (IN), TLS handshake, Finished (20):
{ [16 bytes data]
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server did not agree to a protocol
* Server certificate:
*  subject: jurisdictionC=GB; businessCategory=Government Entity; serialNumber=Government Entity; C=GB; ST=Essex; L=Saffron Walden; O=European Bioinformatics Institute; CN=www.ebi.ac.uk
*  start date: May  5 11:33:46 2020 GMT
*  expire date: May  5 11:43:00 2022 GMT
*  subjectAltName: host "www.ebi.ac.uk" matched cert's "www.ebi.ac.uk"
*  issuer: C=BM; O=QuoVadis Limited; CN=QuoVadis EV SSL ICA G3
*  SSL certificate verify ok.
} [5 bytes data]
> GET /ena/cram/md5/6aef897c3d6ff0c78aff06ac189178dd HTTP/1.1
> Host: www.ebi.ac.uk
> User-Agent: curl/7.70.0
> Accept: */*
> 
{ [5 bytes data]
* Mark bundle as not supporting multiuse
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Cache-Control: max-age=31536000
< X-Cache: HIT from pg-ena-cram-1.ebi.ac.uk
< Content-Type: text/plain
< Strict-Transport-Security: max-age=0
< Date: Sun, 28 Jun 2020 04:11:06 GMT
< X-Cache-Lookup: HIT from pg-ena-cram-1.ebi.ac.uk:8080
< Via: 1.0 pg-ena-cram-1.ebi.ac.uk (squid/3.1.23)
* HTTP/1.0 connection set to keep alive!
< Connection: keep-alive
< Age: 493608
< Warning: 113 pg-ena-cram-1.ebi.ac.uk (squid/3.1.23) This cache hit is still fresh and more than 1 day old
< Content-Length: 248956422
< 
{ [3807 bytes data]
100  237M  100  237M    0     0  8956k      0  0:00:27  0:00:27 --:--:-- 8125k
* Connection #0 to host www.ebi.ac.uk left intact

@whitwham
Contributor

whitwham commented Jul 7, 2020

Looks like the same output I get when I run the same command. There is nothing that stands out to me.

@outpaddling
Author

Can you reproduce the issue with samtools? Maybe grab a CRAM from SRA to keep the variables to a minimum.

@whitwham
Contributor

whitwham commented Jul 9, 2020

Can you reproduce the issue with samtools? Maybe grab a CRAM from SRA to keep the variables to a minimum.

I don't think we have access to any GCP instances.

@outpaddling
Author

It's not limited to GCP like I originally thought. As I mentioned earlier, I'm seeing the same error on my FreeBSD workstation and the recent debugging output is all from a pristine local CentOS 7 VM. Running curl directly works fine in all environments.

@whitwham
Contributor

whitwham commented Jul 9, 2020

My mistake, I was fixated on the GCP aspect. Do you have a suggested cram file I can download?

@whitwham
Contributor

So I tried replicating your error on my home laptop by inserting the M5 tag into a tiny test file. It all seemed to work for me. I'll include the output; maybe you can spot something I've missed.

samtools view --verbosity=8 -h 5_markdup.cram

[I::cram_populate_ref] Running cram_populate_ref on fd 0x5606d44c7850, id 0
[I::cram_populate_ref] Populating local cache: /home/aaw/.cache/hts-ref/%2s/%2s/%s
[I::cram_populate_ref] Querying ref 6aef897c3d6ff0c78aff06ac189178dd
[D::init_add_plugin] Loaded "knetfile"
[D::init_add_plugin] Loaded "mem"
[D::init_add_plugin] Loaded "libcurl"
[D::init_add_plugin] Loaded "gcs"
[D::init_add_plugin] Loaded "s3"
[D::init_add_plugin] Loaded "s3w"

  • Trying 193.62.193.80:443...
  • TCP_NODELAY set
  • Connected to www.ebi.ac.uk (193.62.193.80) port 443 (#0)
  • ALPN, offering h2
  • ALPN, offering http/1.1
  • successfully set certificate verify locations:
  • CAfile: none
    CApath: /etc/ssl/certs
  • SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
  • ALPN, server did not agree to a protocol
  • Server certificate:
  • subject: jurisdictionC=GB; businessCategory=Government Entity; serialNumber=Government Entity; C=GB; ST=Essex; L=Saffron Walden; O=European Bioinformatics Institute; CN=www.ebi.ac.uk
  • start date: May 5 11:33:46 2020 GMT
  • expire date: May 5 11:43:00 2022 GMT
  • subjectAltName: host "www.ebi.ac.uk" matched cert's "www.ebi.ac.uk"
  • issuer: C=BM; O=QuoVadis Limited; CN=QuoVadis EV SSL ICA G3
  • SSL certificate verify ok.

GET /ena/cram/md5/6aef897c3d6ff0c78aff06ac189178dd HTTP/1.1
Host: www.ebi.ac.uk
User-Agent: htslib/1.10 libcurl/7.65.3
Accept: */*

  • Mark bundle as not supporting multiuse
  • HTTP 1.0, assume close after body
    < HTTP/1.0 200 OK
    < Cache-Control: max-age=31536000
    < X-Cache: HIT from pg-ena-cram-1.ebi.ac.uk
    < Content-Type: text/plain
    < Strict-Transport-Security: max-age=0
    < Date: Sun, 28 Jun 2020 04:11:06 GMT
    < X-Cache-Lookup: HIT from pg-ena-cram-1.ebi.ac.uk:8080
    < Via: 1.0 pg-ena-cram-1.ebi.ac.uk (squid/3.1.23)
  • HTTP/1.0 connection set to keep alive!
    < Connection: keep-alive
    < Age: 1312778
    < Warning: 113 pg-ena-cram-1.ebi.ac.uk (squid/3.1.23) This cache hit is still fresh and more than 1 day old
    < Content-Length: 248956422
    <
  • Connection #0 to host www.ebi.ac.uk left intact
    [W::cram_populate_ref] Creating reference cache directory /home/aaw/.cache/hts-ref
    This may become large; see the samtools(1) manual page REF_CACHE discussion
    [I::cram_populate_ref] Writing cache file '/home/aaw/.cache/hts-ref/6a/ef/897c3d6ff0c78aff06ac189178dd'

@whitwham
Contributor

I don't think we are going to get anywhere without knowing what error is being thrown. We could change http_status_errno and easy_error (and possibly multi_errno) in hfile_libcurl.c to print out the error number.

@outpaddling
Author

outpaddling commented Jul 13, 2020

Regarding your earlier question, all of our CRAMs are restricted access from the Women's Health Initiative on SRA. I'm not sure if there are equivalent CRAMs that you would be able to access. Might be worth exploring...

As for debugging, I took your suggestion and patched a couple of fprintfs into the latest htslib commit:

https://github.com/outpaddling/freebsd-ports-wip/tree/master/htslib

Running this with the latest samtools commit produced the following. It shows that easy_errno() is receiving a code of 43, though I'm not sure where from yet. There are a lot of calls to this function...

But I found this in curl.h:

CURLE_BAD_FUNCTION_ARGUMENT, /* 43 */

  • Trying 193.62.193.80:443...
  • Connected to www.ebi.ac.uk (193.62.193.80) port 443 (#0)
  • ALPN, offering h2
  • ALPN, offering http/1.1
  • successfully set certificate verify locations:
  • CAfile: /usr/local/share/certs/ca-root-nss.crt
    CApath: none
  • SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
  • ALPN, server did not agree to a protocol
  • Server certificate:
  • subject: jurisdictionC=GB; businessCategory=Government Entity; serialNumber=Government Entity; C=GB; ST=Essex; L=Saffron Walden; O=European Bioinformatics Institute; CN=www.ebi.ac.uk
  • start date: May 5 11:33:46 2020 GMT
  • expire date: May 5 11:43:00 2022 GMT
  • subjectAltName: host "www.ebi.ac.uk" matched cert's "www.ebi.ac.uk"
  • issuer: C=BM; O=QuoVadis Limited; CN=QuoVadis EV SSL ICA G3
  • SSL certificate verify ok.

GET /ena/cram/md5/6aef897c3d6ff0c78aff06ac189178dd HTTP/1.1
Host: www.ebi.ac.uk
User-Agent: htslib/1.10.2 libcurl/7.71.0
Accept: */*

  • Mark bundle as not supporting multiuse
  • HTTP 1.0, assume close after body
    < HTTP/1.0 200 OK
    < Cache-Control: max-age=31536000
    < X-Cache: HIT from pg-ena-cram-1.ebi.ac.uk
    < Content-Type: text/plain
    < Strict-Transport-Security: max-age=0
    < Date: Sun, 28 Jun 2020 04:11:06 GMT
    < X-Cache-Lookup: HIT from pg-ena-cram-1.ebi.ac.uk:8080
    < Via: 1.0 pg-ena-cram-1.ebi.ac.uk (squid/3.1.23)
  • HTTP/1.0 connection set to keep alive!
    < Connection: keep-alive
    < Age: 1350752
    < Warning: 113 pg-ena-cram-1.ebi.ac.uk (squid/3.1.23) This cache hit is still fresh and more than 1 day old
    < Content-Length: 248956422
    <
  • Connection #0 to host www.ebi.ac.uk left intact
    easy_errno: err = 43
    easy_errno: err = 43
    [W::find_file_url] Failed to read reference "https://www.ebi.ac.uk/ena/cram/md5/6aef897c3d6ff0c78aff06ac189178dd": Input/output error
    [E::fai_build3_core] Failed to open the file /net/mario/nodeDataMaster/local/ref/gotcloud.ref/hg38/hs38DH.fa
    [E::refs_load_fai] Failed to open reference file '/net/mario/nodeDataMaster/local/ref/gotcloud.ref/hg38/hs38DH.fa'
    [E::cram_get_ref] Failed to populate reference for id 0
    [E::cram_decode_slice] Unable to fetch reference #0:9997-44254

[E::cram_next_slice] Failure to decode slice
[main_samview] truncated file.

@outpaddling
Author

I'm wondering if this is a curl problem or if something is failing after the download. The download runs for a long time and I can see activity under iftop or netstat that looks similar to what I get from a manual curl/wget/fetch.
And what does this /net/mario/... path refer to? Is that something embedded in the CRAM file?

@jmarshall
Member

User-Agent: htslib/1.10.2 libcurl/7.71.0

It shows that easy_errno() is receiving a code of 43 [CURLE_BAD_FUNCTION_ARGUMENT].

Taken together, these indicate that this is indeed the same problem as #1284. So when htslib is built against a sufficiently recent libcurl, it can be reproduced with essentially any files accessed over http(s).

@outpaddling
Author

I just installed curl 7.67.0 from the FreeBSD ports history, and this eliminated the problem. The same version is installed in an old pkgsrc tree on our CentOS cluster, and samtools works there as well. I assume it won't be hard to fix htslib to work with the latest libcurl, but we have a workaround for now in any case.

@daviesrob
Member

HTSlib pull request samtools/htslib#1105 should fix this. Would it be possible to give it a try on your systems?

@outpaddling changed the title from "Trouble reading SRA CRAM files in Google Cloud buckets" to "Trouble reading SRA CRAM files" on Jul 14, 2020
@outpaddling
Author

I applied the pull request to commit 9c357445... on a FreeBSD server with curl 7.71.0. So far, so good: it successfully cached the first reference for a large CRAM file, which it had been unable to do for some time.
Is a new release with this fix likely soon? If not, I'll plan to add a patch to the 1.10.2 FreeBSD port and pkgsrc frameworks for now.

@bw2

bw2 commented Mar 25, 2021

For anyone else who runs into this issue, the following worked for me:

wget https://storage.googleapis.com/gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.fasta
wget https://storage.googleapis.com/gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.fasta.fai 
wget https://storage.googleapis.com/gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.ref_cache.tar.gz

tar xzf Homo_sapiens_assembly38.ref_cache.tar.gz
export REF_PATH="$(pwd)/ref/cache/%2s/%2s/%s:http://www.ebi.ac.uk/ena/cram/md5/%s"
export REF_CACHE="$(pwd)/ref/cache/%2s/%2s/%s"

...

{command that was generating the '[W::find_file_url] Failed to open reference..' error}
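For clarity, here is a sketch of my own (not from the thread) showing how the %2s/%2s/%s placeholders in REF_PATH/REF_CACHE expand: the reference's MD5 is split into two two-character directory levels plus the remainder as the filename, so the hash from the errors above resolves like this:

```shell
#!/bin/sh
# Expand an hts-ref style %2s/%2s/%s template for a given MD5.
md5=6aef897c3d6ff0c78aff06ac189178dd
l1=$(echo "$md5" | cut -c 1-2)       # 6a
l2=$(echo "$md5" | cut -c 3-4)       # ef
rest=$(echo "$md5" | cut -c 5-)      # 897c3d6ff0c78aff06ac189178dd
echo "ref/cache/$l1/$l2/$rest"       # ref/cache/6a/ef/897c3d6ff0c78aff06ac189178dd
```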
