2. Set up your browser and Jupyter environment, and connect to the master node #27

Closed
7 tasks done
fatse opened this issue Apr 20, 2021 · 1 comment
@fatse
Collaborator

fatse commented Apr 20, 2021

rubric={correctness:25}

  • 2.1) Under cluster summary > Application user interfaces > On-cluster user interfaces: Click on Enable an SSH Connection.
  • 2.2) From the instructions in the popup from Step 2.1, use Step 1: Open an SSH Tunnel to the Amazon EMR Master Node. Remember that you run this from your laptop terminal; after running, it will look like this.
  • 2.3) From the instructions in the popup from Step 2.1, please ignore Step 2: Configure a proxy management tool. Instead, follow the instructions given here, under the section Example: Configure FoxyProxy for Firefox. Get FoxyProxy Standard here.
  • 2.4) Move to the Application user interfaces tab and use the JupyterHub URL to access it.
  • 2.4.1) Username: jovyan, Password: jupyter. These are the defaults; more details here.
  • 2.5) [OPTIONAL] Remember, we are using EMR-managed JupyterHub, and its setup is different from TLJH. So before you add users in JupyterHub, run the commands below after SSHing into the master node. Follow the instructions under cluster summary > Connect to the Master Node Using SSH. Remember, you are running this from your laptop terminal. Once you are inside the server/instance, add your team members.
 sudo docker exec jupyterhub useradd -m -s /bin/bash -N <your team member IAM id>
 sudo docker exec jupyterhub bash -c "echo <your team member IAM id>:<your team member password> | chpasswd"
  • 2.6) Log in to the master node from your laptop terminal (cluster summary > Connect to the Master Node Using SSH), and install the necessary packages. These are the packages needed for my solution; you might have to install other packages depending on your approach.
sudo yum install python3-devel
sudo pip3 install pandas
sudo pip3 install s3fs
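
If several team members need accounts (step 2.5), the two `docker exec` commands can be generated per member rather than typed by hand. The following is only a sketch: `print_member_cmds` is a hypothetical helper name, and the usernames and passwords are placeholders. It prints the commands instead of running them, so you can review the output before executing it on the master node.

```shell
#!/usr/bin/env bash
# Sketch only: print the two commands from step 2.5 for one team member.
# print_member_cmds is a hypothetical helper, not part of EMR or JupyterHub.
print_member_cmds() {
  local user="$1" password="$2"
  printf 'sudo docker exec jupyterhub useradd -m -s /bin/bash -N %s\n' "$user"
  printf 'sudo docker exec jupyterhub bash -c "echo %s:%s | chpasswd"\n' "$user" "$password"
}

# Placeholder user:password pairs; replace them with your team's values.
for member in alice:changeme1 bob:changeme2; do
  # ${member%%:*} is the part before the first colon, ${member#*:} the part after it.
  print_member_cmds "${member%%:*}" "${member#*:}"
done
```

This assumes usernames and passwords contain no spaces, quotes, or other shell-special characters; anything more complex is safer to enter by hand using the commands in step 2.5.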

IMPORTANT: Make sure ssh -i ~/ggeorgeAD.pem -ND 8157 hadoop@xxxxx.compute.amazonaws.com is running in a terminal window before trying to access your JupyterHub URL. The connection sometimes drops; in that case, run that step again to regain access to your JupyterHub.
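
Because the tunnel command has to be rerun whenever the connection drops, it can help to keep it in a small script. This is a minimal sketch, not part of the official instructions: `tunnel_cmd`, `KEY_FILE`, and `MASTER_DNS` are hypothetical names, and both values are placeholders you must replace with your own key file and master node DNS name.

```shell
#!/usr/bin/env bash
# Sketch only: build the SSH tunnel command from the note above.
# tunnel_cmd is a hypothetical helper; the port defaults to 8157 as in the instructions.
tunnel_cmd() {
  local key_file="$1" master_dns="$2" port="${3:-8157}"
  printf 'ssh -i %s -ND %s hadoop@%s\n' "$key_file" "$port" "$master_dns"
}

KEY_FILE="$HOME/my-key.pem"               # placeholder: path to your .pem key
MASTER_DNS="xxxxx.compute.amazonaws.com"  # placeholder: your master node DNS name

# Print the command so it can be checked before running it.
tunnel_cmd "$KEY_FILE" "$MASTER_DNS"

# One way to reconnect automatically after a drop (uncomment to use; stop with Ctrl-C):
# while true; do ssh -i "$KEY_FILE" -ND 8157 "hadoop@$MASTER_DNS"; sleep 5; done
```

Adding `-o ServerAliveInterval=60` to the `ssh` command is another option that may reduce idle disconnects; whether it helps depends on your network.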

Please attach this screenshot from your group for grading:
https://github.ubc.ca/MDS-2020-21/DSCI_525_web-cloud-comp_students/blob/master/Milestones/milestone3/images/Task2.png

@fatse fatse added the specs label Apr 20, 2021
@fatse fatse added this to the milestone3 milestone Apr 20, 2021
@fatse
Collaborator Author

fatse commented Apr 20, 2021

Screen Shot 2021-04-20 at 3 40 50 PM

@fatse fatse mentioned this issue Apr 21, 2021
4 tasks
@fatse fatse closed this as completed in 4f7a063 Apr 24, 2021