## General

This notebook documents and provides scripts for managing a spark cluster.

The cluster is started using the spark-notebook script created by Kevin Coakley which is based on a script create by Julaiti Alafat. The advantage of Coakley's script is that it uses AWS-EMR instead of directly managing ec2 instances. Clone Coakley's script using 
```sh
git clone https://github.com/mas-dse/spark-notebook.git
```

Follow directions to initialize and start the web interface. The command you use to start the cluster spinner is 
```
./run.py &
```
The first time you use the script you will need to enter your AWS credentials. Those will be kept in a yaml file for the next times.

## Bootstrap
After the script starts the cluster, it executes a bootstrap script. The script is in the spark notebook directory in the file `provision/jupyter-provision-v0.4.sh`

## Per-session bootstrap
You can add an additional script that will be executed after the general bootstrap. This script will be executed on both the head node and the worker nodes. In order to restrict the execution to the head node surround your commands with the following if-then:
```sh
# check for master node
if grep isMaster /mnt/var/lib/info/instance.json | grep true;
then
   #put here commands that are intended only for the head node
fi
```
An example script is below:

```sh
# %load PrivateBootstrap.sh
# check for master node
if grep isMaster /mnt/var/lib/info/instance.json | grep true;
then
   cd /mnt/workspace/

   date +%H.%M:%S:%N  #>> /mnt/workspace/PrivateBootstrap.log
   echo “Start of bootsrap, set up git” #>> /mnt/workspace/PrivateBootstrap.log
   git config --global user.email "yoav.freund@gmail.com"
   git config --global user.name “Yoav Freund”
   git config --global credential.helper cache
   echo "git clone https://github.com/ucsd-edx/edX-Micro-Master-in-Data-Science.git" >clone.sh    # could not figure a way to clone withut user intervensin, 
   # so making the clone into a one line script that need to be executed manually.

   date +%H.%M:%S:%N  #>> /mnt/workspace/PrivateBootstrap.log
   echo “copy files from S3 to Local”  #>> /mnt/workspace/PrivateBootstrap.log
   mkdir Data
   cd Data
   aws s3 cp --recursive s3://dse-weather/weather.parquet  ./weather.parquet

   date +%H.%M:%S:%N  #>> /mnt/workspace/PrivateBootstrap.log
   echo “copy files from Local to HDFS”  #>> /mnt/workspace/PrivateBootstrap.log
   hadoop fs -mkdir /weather
   hadoop fs -copyFromLocal weather.parquet /weather/weather.parquet

   date +%H.%M:%S:%N  #>> /mnt/workspace/PrivateBootstrap.log
   echo “Bootstrap done”  #>> /mnt/workspace/PrivateBootstrap.log
fi
```

The cluster nodes will recieve the script from s3. You therefor need to copy the script into s3 before starting the cluster. You can use the AWS command line (which you need to install on your laptop) to copy a local script to an s3 bucket that is accessible using the credentials f the cluster:
```sh
aws s3 cp PrivateBootstrap.sh s3://dse-weather/PrivateBootstrap.sh
```
You then need to type the s3 location when you start the script under "advanced options"

## log files

At the bottom of the spark-notebook page, before you start a cluster, there is a line of the form:
```
EMR Logs S3 Bucket [?] s3://aws-logs-846273844940-us-east-1
```
This line tells you the s3 bucket where the logs reside.

It is not easy to find out which of the logs are related to your current cluster and which are left over from previous runs. I wrote some code here to help with that.

First, we get a listing of all of the files in the bucket

In [10]:
logs_bucket="aws-logs-846273844940-us-east-1"
!aws s3 ls --recursive $logs_bucket/ > logOfLogs

In [11]:
# I now grep for today's date
import datetime as dt
now=dt.datetime.now()
now.day
#dt.datetime.strptime

18

In [13]:
!grep "2018-02-19" logOfLogs

2018-02-18 20:31:37       9258 elasticmapreduce/j-1CRVRCKP9FR84/node/i-00558bee8b5f6e55d/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-249-61.log.2018-02-19-03.gz
2018-02-18 19:21:37      24069 elasticmapreduce/j-1CRVRCKP9FR84/node/i-00558bee8b5f6e55d/daemons/instance-state/instance-state.log-2018-02-19-03-15.gz
2018-02-18 19:41:37      23805 elasticmapreduce/j-1CRVRCKP9FR84/node/i-00558bee8b5f6e55d/daemons/instance-state/instance-state.log-2018-02-19-03-30.gz
2018-02-18 19:56:37      23567 elasticmapreduce/j-1CRVRCKP9FR84/node/i-00558bee8b5f6e55d/daemons/instance-state/instance-state.log-2018-02-19-03-45.gz
2018-02-18 20:06:37      22687 elasticmapreduce/j-1CRVRCKP9FR84/node/i-00558bee8b5f6e55d/daemons/instance-state/instance-state.log-2018-02-19-04-00.gz
2018-02-18 20:21:37      24043 elasticmapreduce/j-1CRVRCKP9FR84/node/i-00558bee8b5f6e55d/daemons/instance-state/instance-state.log-2018-02-19-04-15.gz
2018-02-18 20:41:37      23575 elasticmapreduce/j-1CRVRCKP9FR84/no

In [4]:
i=0
import re
rp=r''
pat=re.compile(r'(\d+-\d+-\d+\s+\d+:\d+:\d+)\s+(\d+)\s+([^/]+/)([^/]+)/(.*)')
from collections import Counter
C={}
with open('logOfLogs','r') as logs:
    for line in logs.readlines():
        m=pat.search(line)
        if m:
            timestamp,size,_dir,prefix,file=m.groups()
            #print(timestamp,size,prefix,file)

            ts=dt.datetime.strptime(timestamp,'%Y-%m-%d %H:%M:%S')
            if now.year==ts.year and now.month==ts.month and ts.day==19:
                if prefix in C:
                    C[prefix].append(ts)
                else:
                    C[prefix]=[ts]
            i+=1
print("A listing of today's logs\n")
print(" session\t Started\t\t Ended \t\t\t No. of files")
for prefix in C.keys():
    print('%s\t%s\t%s\t %d'%(prefix,min(C[prefix]),max(C[prefix]),len(C[prefix])))

## to do  : print logs in order of start time
print("_dir=",_dir)

A listing of today's logs

 session	 Started		 Ended 			 No. of files
_dir= <built-in function dir>


In [9]:
ts.day==28

False

### download a specific session for inspection

In [10]:
current='j-JU6TGVTKCC9Y'
s3path='s3://'+logs_bucket+'/'+_dir+current+'/'
print(s3path)
%cd /tmp
!aws s3 cp --recursive $s3path $current

s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/
/private/tmp
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-00ef5ccc42effdfa1/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-239-205.log.gz to j-JU6TGVTKCC9Y/node/i-00ef5ccc42effdfa1/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-239-205.log.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-00ef5ccc42effdfa1/bootstrap-actions/1/controller.gz to j-JU6TGVTKCC9Y/node/i-00ef5ccc42effdfa1/bootstrap-actions/1/controller.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-00ef5ccc42effdfa1/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-239-205.log.2018-02-18-22.gz to j-JU6TGVTKCC9Y/node/i-00ef5ccc42effdfa1/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-239-205.log.2018-02-18-22.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-00ef5ccc42

download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-00ef5ccc42effdfa1/setup-devices/setup_var_log_dir.log.gz to j-JU6TGVTKCC9Y/node/i-00ef5ccc42effdfa1/setup-devices/setup_var_log_dir.log.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-00ef5ccc42effdfa1/setup-devices/setup_var_tmp_dir.log.gz to j-JU6TGVTKCC9Y/node/i-00ef5ccc42effdfa1/setup-devices/setup_var_tmp_dir.log.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-0288ea472349f126e/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-228-145.log.2018-02-18-21.gz to j-JU6TGVTKCC9Y/node/i-0288ea472349f126e/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-228-145.log.2018-02-18-21.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-0288ea472349f126e/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-228-145.out.gz to j-JU6TGVTKCC9Y/node/i-0288ea472349f126e/applications/ha

download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-0288ea472349f126e/setup-devices/setup_var_lib_dir.log.gz to j-JU6TGVTKCC9Y/node/i-0288ea472349f126e/setup-devices/setup_var_lib_dir.log.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-02fe4b250d74129e3/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-248-207.out.gz to j-JU6TGVTKCC9Y/node/i-02fe4b250d74129e3/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-248-207.out.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-02fe4b250d74129e3/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-248-207.log.2018-02-18-21.gz to j-JU6TGVTKCC9Y/node/i-02fe4b250d74129e3/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-248-207.log.2018-02-18-21.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-0288ea472349f126e/setup-devices/setup_var_tmp_dir.log.gz to j-JU6TGVTKCC9Y/node/i-0

download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-02fe4b250d74129e3/setup-devices/setup_var_log_dir.log.gz to j-JU6TGVTKCC9Y/node/i-02fe4b250d74129e3/setup-devices/setup_var_log_dir.log.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-02fe4b250d74129e3/setup-devices/setup_var_tmp_dir.log.gz to j-JU6TGVTKCC9Y/node/i-02fe4b250d74129e3/setup-devices/setup_var_tmp_dir.log.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-02ff8b34babf37087/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-239-241.log.gz to j-JU6TGVTKCC9Y/node/i-02ff8b34babf37087/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-239-241.log.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-02ff8b34babf37087/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-239-241.log.2018-02-18-21.gz to j-JU6TGVTKCC9Y/node/i-02ff8b34babf37087/applications/hadoop-hdfs/hado

download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-02ff8b34babf37087/setup-devices/setup_tmp_dir.log.gz to j-JU6TGVTKCC9Y/node/i-02ff8b34babf37087/setup-devices/setup_tmp_dir.log.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-02ff8b34babf37087/setup-devices/setup_var_tmp_dir.log.gz to j-JU6TGVTKCC9Y/node/i-02ff8b34babf37087/setup-devices/setup_var_tmp_dir.log.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-0371fc8754bf345c5/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-239-143.log.gz to j-JU6TGVTKCC9Y/node/i-0371fc8754bf345c5/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-239-143.log.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-02ff8b34babf37087/setup-devices/setup_var_log_dir.log.gz to j-JU6TGVTKCC9Y/node/i-02ff8b34babf37087/setup-devices/setup_var_log_dir.log.gz
download: s3://aws-logs-846273844940-us-east

download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-046f6a88e76cd30e0/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-248-190.log.2018-02-18-21.gz to j-JU6TGVTKCC9Y/node/i-046f6a88e76cd30e0/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-248-190.log.2018-02-18-21.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-0371fc8754bf345c5/setup-devices/setup_var_log_dir.log.gz to j-JU6TGVTKCC9Y/node/i-0371fc8754bf345c5/setup-devices/setup_var_log_dir.log.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-0371fc8754bf345c5/setup-devices/setup_var_tmp_dir.log.gz to j-JU6TGVTKCC9Y/node/i-0371fc8754bf345c5/setup-devices/setup_var_tmp_dir.log.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-046f6a88e76cd30e0/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-248-190.log.2018-02-18-23.gz to j-JU6TGVTKCC9Y/node/i-046f6a88e76cd30e0/a

download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-06291bfec6ce584fd/applications/hadoop-hdfs/hadoop-hdfs-namenode-ip-10-129-253-102.log.2018-02-18-21.gz to j-JU6TGVTKCC9Y/node/i-06291bfec6ce584fd/applications/hadoop-hdfs/hadoop-hdfs-namenode-ip-10-129-253-102.log.2018-02-18-21.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-06291bfec6ce584fd/applications/hadoop-hdfs/hadoop-hdfs-namenode-ip-10-129-253-102.log.2018-02-18-22.gz to j-JU6TGVTKCC9Y/node/i-06291bfec6ce584fd/applications/hadoop-hdfs/hadoop-hdfs-namenode-ip-10-129-253-102.log.2018-02-18-22.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-06291bfec6ce584fd/applications/hadoop-hdfs/hadoop-hdfs-namenode-ip-10-129-253-102.log.2018-02-18-23.gz to j-JU6TGVTKCC9Y/node/i-06291bfec6ce584fd/applications/hadoop-hdfs/hadoop-hdfs-namenode-ip-10-129-253-102.log.2018-02-18-23.gz
download: s3://aws-logs-846273844940-us-east-1/el

download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-06291bfec6ce584fd/bootstrap-actions/2/controller.gz to j-JU6TGVTKCC9Y/node/i-06291bfec6ce584fd/bootstrap-actions/2/controller.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-06291bfec6ce584fd/bootstrap-actions/2/stderr.gz to j-JU6TGVTKCC9Y/node/i-06291bfec6ce584fd/bootstrap-actions/2/stderr.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-06291bfec6ce584fd/bootstrap-actions/2/stdout.gz to j-JU6TGVTKCC9Y/node/i-06291bfec6ce584fd/bootstrap-actions/2/stdout.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-06291bfec6ce584fd/bootstrap-actions/master.log.gz to j-JU6TGVTKCC9Y/node/i-06291bfec6ce584fd/bootstrap-actions/master.log.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-06291bfec6ce584fd/daemons/instance-state/instance-state.log-2018-02-18-22-

download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-07f2897a3ec61bb3d/applications/hadoop-yarn/yarn-yarn-nodemanager-ip-10-129-250-58.log.2018-02-18-21.gz to j-JU6TGVTKCC9Y/node/i-07f2897a3ec61bb3d/applications/hadoop-yarn/yarn-yarn-nodemanager-ip-10-129-250-58.log.2018-02-18-21.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-07f2897a3ec61bb3d/applications/hadoop-yarn/yarn-yarn-nodemanager-ip-10-129-250-58.log.gz to j-JU6TGVTKCC9Y/node/i-07f2897a3ec61bb3d/applications/hadoop-yarn/yarn-yarn-nodemanager-ip-10-129-250-58.log.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-07f2897a3ec61bb3d/bootstrap-actions/1/controller.gz to j-JU6TGVTKCC9Y/node/i-07f2897a3ec61bb3d/bootstrap-actions/1/controller.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-07f2897a3ec61bb3d/bootstrap-actions/2/controller.gz to j-JU6TGVTKCC9Y/node/i-07f2897a3ec61bb3

download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-09ad5dff6a2e89c7a/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-251-153.out.gz to j-JU6TGVTKCC9Y/node/i-09ad5dff6a2e89c7a/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-251-153.out.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-09ad5dff6a2e89c7a/applications/hadoop-yarn/yarn-yarn-nodemanager-ip-10-129-251-153.log.gz to j-JU6TGVTKCC9Y/node/i-09ad5dff6a2e89c7a/applications/hadoop-yarn/yarn-yarn-nodemanager-ip-10-129-251-153.log.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-07f2897a3ec61bb3d/setup-devices/setup_var_tmp_dir.log.gz to j-JU6TGVTKCC9Y/node/i-07f2897a3ec61bb3d/setup-devices/setup_var_tmp_dir.log.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-09ad5dff6a2e89c7a/daemons/instance-state/instance-state.log-2018-02-18-21-45.gz to j-JU6TGVTKCC9Y/node/i-09ad

download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-0a3d6028144030752/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-231-205.log.2018-02-18-22.gz to j-JU6TGVTKCC9Y/node/i-0a3d6028144030752/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-231-205.log.2018-02-18-22.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-0a3d6028144030752/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-231-205.log.gz to j-JU6TGVTKCC9Y/node/i-0a3d6028144030752/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-231-205.log.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-0a3d6028144030752/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-231-205.out.gz to j-JU6TGVTKCC9Y/node/i-0a3d6028144030752/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-231-205.out.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-0a3d6028144030752/b

download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-0ddd3ad4127dfd0c4/applications/hadoop-yarn/yarn-yarn-nodemanager-ip-10-129-225-133.log.gz to j-JU6TGVTKCC9Y/node/i-0ddd3ad4127dfd0c4/applications/hadoop-yarn/yarn-yarn-nodemanager-ip-10-129-225-133.log.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-0ddd3ad4127dfd0c4/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-225-133.out.gz to j-JU6TGVTKCC9Y/node/i-0ddd3ad4127dfd0c4/applications/hadoop-hdfs/hadoop-hdfs-datanode-ip-10-129-225-133.out.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-0ddd3ad4127dfd0c4/applications/hadoop-yarn/yarn-yarn-nodemanager-ip-10-129-225-133.out.gz to j-JU6TGVTKCC9Y/node/i-0ddd3ad4127dfd0c4/applications/hadoop-yarn/yarn-yarn-nodemanager-ip-10-129-225-133.out.gz
download: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-0ddd3ad4127dfd0c4/applications/hadoop-hdfs/

Completed 395 of 396 part(s) with 1 file(s) remainingdownload: s3://aws-logs-846273844940-us-east-1/elasticmapreduce/j-JU6TGVTKCC9Y/node/i-0ddd3ad4127dfd0c4/setup-devices/setup_var_lib_dir.log.gz to j-JU6TGVTKCC9Y/node/i-0ddd3ad4127dfd0c4/setup-devices/setup_var_lib_dir.log.gz


In [19]:
#the top level partition is by the node name
!ls $current/node

[34mi-00ef5ccc42effdfa1[m[m [34mi-02ff8b34babf37087[m[m [34mi-06291bfec6ce584fd[m[m [34mi-0a3d6028144030752[m[m
[34mi-0288ea472349f126e[m[m [34mi-0371fc8754bf345c5[m[m [34mi-07f2897a3ec61bb3d[m[m [34mi-0ddd3ad4127dfd0c4[m[m
[34mi-02fe4b250d74129e3[m[m [34mi-046f6a88e76cd30e0[m[m [34mi-09ad5dff6a2e89c7a[m[m


In [26]:
# For each node there are the following directories of logs
# I currently care about bootstrap-actions for the master
# You find the master by finding the file `master.log.gz inside bootstap-actions
!ls $current/node/*/bootstrap-actions/master.log.gz

j-JU6TGVTKCC9Y/node/i-06291bfec6ce584fd/bootstrap-actions/master.log.gz


In [27]:
## We go to that directory to see what happened with our second bootsrtap
%cd j-JU6TGVTKCC9Y/node/i-06291bfec6ce584fd/bootstrap-actions/
!ls

/private/tmp/j-JU6TGVTKCC9Y/node/i-06291bfec6ce584fd/bootstrap-actions
[34m1[m[m             [34m2[m[m             master.log.gz


In [28]:
#going into log 2 which corresponds to our log
%cd 2
!ls -l

/private/tmp/j-JU6TGVTKCC9Y/node/i-06291bfec6ce584fd/bootstrap-actions/2
total 1304
-rw-r--r--  1 yoavfreund  wheel    1990 Feb 18 13:42 controller
-rw-r--r--  1 yoavfreund  wheel     840 Feb 18 13:42 controller.gz
-rw-r--r--  1 yoavfreund  wheel     344 Feb 18 13:42 stderr
-rw-r--r--  1 yoavfreund  wheel     205 Feb 18 13:42 stderr.gz
-rw-r--r--  1 yoavfreund  wheel  629431 Feb 18 13:42 stdout
-rw-r--r--  1 yoavfreund  wheel   19817 Feb 18 13:42 stdout.gz
