
SparkContext doesn't appear to be present after install on ESXi #17

Closed
wesleyraptor opened this issue Jan 17, 2018 · 4 comments

@wesleyraptor

wesleyraptor commented Jan 17, 2018

I installed this in AWS the other day with no issues. Trying to install on ESXi in a virtual lab and I've hit a snag twice in a row now where the Spark UI doesn't come up. I don't think it's an environmental issue - DNS, HTTP, and HTTPS all work. Did not see any errors whatsoever in the helk-install log.

Installed on an Ubuntu Server 16.04.2 amd64 (xenial) VM. Pulled the image from Docker Hub and ran it per the install script. I'm getting the Kibana UI and the Jupyter notebook (ports 80 and 8880 respectively) but nothing on port 4040 for the Spark UI. I looked in the container and noticed the ESXi deployment was missing this process:

AWS:
[screenshot: the Spark process running on the AWS deployment]

I tried to run the command verbatim on my ESXi deployment inside the container and got this stack trace:
[screenshot: the resulting stack trace]

I don't have much experience with Spark, so there's a chance I'm way off. My question is: has anyone had success installing this on a VM in ESXi? Thanks!

@Cyb3rWard0g
Owner

Good morning @wesleyraptor. When you installed the image in AWS, did you test the Jupyter notebook first (opening the one I created) before accessing the UI on port 4040? If so, I think the issue is that when no kernels have been initiated (Spark and Jupyter working together), the Spark context does not get initiated. When you installed it in ESXi, did you go straight to the Spark UI, or did you open the Jupyter notebook I have in the repo first? I think that would be a first step to troubleshoot this issue. Also, can you try to install the HELK via the Dockerfile? Let me know if you have the same issues with it. I will also take a look at the PYSPARK_DRIVER_CALLBACK_HOST error that you got. Thank you.

@wesleyraptor
Author

Loading the SparkContext object in the notebook solved the problem. Sorry for my ignorance!

@Cyb3rWard0g
Owner

@wesleyraptor not at all, man. It is all good. It is actually my fault for not adding that detail to the documentation. I am also learning Spark as I go, and I didn't pay attention to that extra step that needs to happen to enable the Spark context. I will add it to the section where I provide information about the Spark UI: "Before accessing the Spark UI, make sure you start a new Jupyter notebook or run the Check_Spark_Graphframes_Integrations.ipynb notebook. That should start Spark and the Spark context" (Spark-Jupyter integration). Thank you for your help on this one and for helping the project update its documentation 👍. @wesleyraptor I would also like to hear more about your use cases with the HELK and help in any way I can if needed.

@Cyb3rWard0g
Owner

Added Spark UI details in the Installation section of the wiki: https://github.com/Cyb3rWard0g/HELK/wiki/Installation. Thank you @wesleyraptor! 👍
