Panic while downloading siva files to HDFS #33

Closed
rporres opened this issue Mar 7, 2018 · 6 comments


rporres commented Mar 7, 2018

Testing with a binary built from #32 (I could not connect to HDFS otherwise), the pga tool fails to download files to HDFS:

# cat /siva.txt | ./pga-rafa get --verbose -i -o hdfs://hdfs-namenode-0.hdfs-namenode.default.svc.cluster.local:8020
downloading siva files by name from stdin
filter flags will be ignored
DEBU[0000] syncing http://pga.sourced.tech//siva/latest/4a/4a14cc02da0a9280538cd3f3242365601d72f241.siva to hdfs://hdfs-namenode-0.hdfs-namenode.default.svc.cluster.local:8020/siva/latest/4a/4a14cc02da0a9280538cd3f3242365601d72f241.siva
panic: runtime error: slice bounds out of range

goroutine 1 [running]:
github.com/src-d/datasets/PublicGitArchive/pga/cmd.downloadFilenames(0x86d4e0, 0xc4201ca080, 0x86d560, 0xc4201b8030, 0xc4201dc000, 0x1f, 0x1f, 0xa, 0x8000105, 0x0)
	/root/go/src/github.com/src-d/datasets/PublicGitArchive/pga/cmd/get.go:91 +0x263
github.com/src-d/datasets/PublicGitArchive/pga/cmd.glob..func1(0xa67920, 0xc4200b6340, 0x0, 0x4, 0x0, 0x0)
	/root/go/src/github.com/src-d/datasets/PublicGitArchive/pga/cmd/get.go:79 +0x3a8
github.com/src-d/datasets/PublicGitArchive/pga/vendor/github.com/spf13/cobra.(*Command).execute(0xa67920, 0xc4200b6300, 0x4, 0x4, 0xa67920, 0xc4200b6300)
	/root/go/src/github.com/src-d/datasets/PublicGitArchive/pga/vendor/github.com/spf13/cobra/command.go:698 +0x46d
github.com/src-d/datasets/PublicGitArchive/pga/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0xa67d60, 0xc4200abf58, 0x73fc95, 0xc4201763c0)
	/root/go/src/github.com/src-d/datasets/PublicGitArchive/pga/vendor/github.com/spf13/cobra/command.go:783 +0x2e4
github.com/src-d/datasets/PublicGitArchive/pga/vendor/github.com/spf13/cobra.(*Command).Execute(0xa67d60, 0xc42002a0b8, 0x0)
	/root/go/src/github.com/src-d/datasets/PublicGitArchive/pga/vendor/github.com/spf13/cobra/command.go:736 +0x2b
github.com/src-d/datasets/PublicGitArchive/pga/cmd.Execute()
	/root/go/src/github.com/src-d/datasets/PublicGitArchive/pga/cmd/root.go:34 +0x2d
main.main()
	/root/go/src/github.com/src-d/datasets/PublicGitArchive/pga/main.go:8 +0x20

Find attached the contents of siva.txt

For the moment I'm using multitool to download to HDFS, as it is not giving me any issues.

cc @vmarkovtsev

Contributor

campoy commented Mar 9, 2018

Do we have an HDFS server I can use for testing? Or a guide explaining how to get an HDFS server running on Docker?

Member

eiso commented Mar 10, 2018

This seems to be the same -i error I was having; it is not related to HDFS itself.
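
For illustration only, here is a minimal Go sketch of how reading names from stdin could panic with "slice bounds out of range" whatever the destination is. It is not the actual pga code, and the cause is an assumption: the debug line above shows a two-character directory (4a) taken from the file name, so slicing an empty or very short line the same way would panic. The helper name sivaPath is hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// sivaPath is a hypothetical helper, not taken from pga. It builds the
// "siva/latest/<first two chars>/<name>" layout seen in the debug output,
// and checks the length before slicing so a blank or short line yields an
// error instead of a "slice bounds out of range" panic.
func sivaPath(name string) (string, error) {
	name = strings.TrimSpace(name)
	if len(name) < 2 {
		return "", fmt.Errorf("invalid siva filename %q", name)
	}
	return "siva/latest/" + name[:2] + "/" + name, nil
}

func main() {
	// Read one file name per line from stdin, as the -i flag does.
	scanner := bufio.NewScanner(os.Stdin)
	for scanner.Scan() {
		p, err := sivaPath(scanner.Text())
		if err != nil {
			// Report and skip bad lines rather than crashing.
			fmt.Fprintln(os.Stderr, "skipping:", err)
			continue
		}
		fmt.Println(p)
	}
	if err := scanner.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "reading stdin:", err)
		os.Exit(1)
	}
}
```

If this assumption holds, a trailing blank line in siva.txt would be enough to reproduce the panic without any HDFS server.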

Author

rporres commented Mar 12, 2018

@campoy: You can deploy your own HDFS in minikube using https://github.com/apache-spark-on-k8s/kubernetes-HDFS/tree/master/charts

Alternatively, you can access our pipeline HDFS server. Ping me and I'll give you instructions.

Contributor

jfontan commented Mar 12, 2018

I use this to start HDFS + Spark locally:

https://github.com/jfontan/spark-docker-compose/blob/master/engine.md

Contributor

campoy commented Mar 14, 2018

HDFS on minikube sounds perfect. I'll give it a try.

Contributor

campoy commented Apr 6, 2018

It took a long time, but I was finally able to connect to an HDFS server using Google Cloud Dataproc and run all my tests.

See #44

campoy closed this as completed in #44 on Apr 6, 2018