From 9686d5ba3be613deebad69e2a574e9c76e3e267b Mon Sep 17 00:00:00 2001
From: Kartik Nagappa <kartiknagappa@users.noreply.github.com>
Date: Tue, 7 Mar 2023 16:30:21 -0800
Subject: [PATCH] new EMR console

- Updated content with new EMR console screenshots
- Kept the old EMR console screenshots as well
- Restructured and made minor edits for readability / flow
---
 README.md | 34 ++++++++++++++++++++++++----------
 1 file changed, 24 insertions(+), 10 deletions(-)
diff --git a/README.md b/README.md
index 09facf5..efcd35d 100644
--- a/README.md
+++ b/README.md
@@ -35,26 +35,40 @@ The script relies on AWS CLI to retreive the data.
 ```
 
 `<cluster id>` is the cluster id that you are interested in parsing. The cluster id is prefixed with 'j-'.
+`<region>` represents [the region the cluster ran in](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html#concepts-available-regions).
 
-<img src="https://user-images.githubusercontent.com/59929718/147899913-c1305da0-aab5-4882-8faa-3beeff710ec8.png" width="50%" height="50%">
+New EMR console             |  Old EMR console
+:-------------------------:|:-------------------------:
+<img src="https://user-images.githubusercontent.com/4088105/223570783-1a729e33-e270-4e4b-82bc-e380fed764ef.png">  |  <img src="https://user-images.githubusercontent.com/4088105/223570400-ef23916f-dab5-465e-8ff5-ca1a57f137be.png">
 
-`<region>` represents [the region the cluster ran in](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html#concepts-available-regions). The script doesn't rely on the region configured in AWS config to align with the region the cluster actually ran in. (e.g. us-east-1)
 
+## Step 2: Retrieve EMR Spark logs and upload into Autotuner
 
-## Step 2: Retrieve EMR Spark logs and upload into Autotuner step #2
+1.  Go to the EMR console in AWS, and find the cluster that ran the job you are interested in optimizing. Click on the cluster name to view details of the cluster.
 
-1.  Assure that you have spark.eventLog.enabled set to true for any jobs you are interested in optimizing. 
+2.  Verify that `spark.eventLog.enabled` is set to `true` for any jobs you are interested in optimizing. The Sync Autotuner needs a Spark event log from a job run to provide optimized cluster configurations for that job.
 
-2.  Go to the EMR console in AWS, and find the cluster that ran the job you are interested in optimizing. Click on the cluster name to view details of the cluster.
-<img src="https://user-images.githubusercontent.com/59929718/147900986-44b68adf-8f7d-4fda-b84b-2c54f6015fc5.png" width="50%" height="50%">
+New EMR console             |  Old EMR console
+:-------------------------:|:-------------------------:
+<img src="https://user-images.githubusercontent.com/4088105/223572670-4ee02e08-3a2e-4021-add6-185f645838fe.png">  |  <img src="https://user-images.githubusercontent.com/4088105/223572532-aa20eb49-a010-401f-a63b-cab035efea5a.png">
 
-3.  Once you are in the cluster information page, click on the “Application user interfaces” tab, and click on “Spark history server” (in red below) under “Persistent application user interfaces.”
-<img src="https://user-images.githubusercontent.com/59929718/147901007-81f08b39-1c20-468f-b57c-57dcfe4e46d5.png" width="50%" height="50%">
+3.  If `spark.eventLog.dir` is set to an S3 location, download the Spark event log from that location, then skip to step 7 below.
 
+4.  If `spark.eventLog.dir` is **not set**, follow steps 5 and 6 below to download the Spark event log from the Spark history server.
 
-4.  A new tab should open up with the Spark history server. It may take a minute to load. Click the download button under the event log column to download the Spark event log. Upload this log into the Autotuner in step #2.
+5.  On the cluster information page, click the “Application user interfaces” tab, then click “Spark history server” (in red below) under “Persistent application user interfaces.”
+
+New EMR console             |  Old EMR console
+:-------------------------:|:-------------------------:
+<img src="https://user-images.githubusercontent.com/4088105/223585554-df6b249d-10ca-41ef-b8a9-ff328a709a9f.png">  |  <img src="https://user-images.githubusercontent.com/4088105/223585649-8f32d7dd-e20d-49af-b307-dba292fcebcd.png">
+
+6.  A new tab should open with the Spark history server. It may take a minute to load. Click the download button in the event log column to download the Spark event log.
 <img src="https://user-images.githubusercontent.com/59929718/147901014-2c111ad3-3a74-4786-971c-880e578c9257.png" width="50%" height="50%">
 
+7.  Upload the Spark event log into the Autotuner.
+<img src="https://user-images.githubusercontent.com/4088105/223587432-013fd96d-597a-49c0-969b-edf0e406706b.png" width="50%" height="50%">
+
+
 
 # Databricks Tools
 
@@ -118,4 +132,4 @@ Instructions for finding a cluster-id through the Databricks console can be foun
   ],
   "total_count": 22
 }    
-```
\ No newline at end of file
+```