# Presto Installation

To use Presto, we need to download and install a few components. This notebook walks through that process.

## Steps to Install Presto on a local machine

At a high level, these are the steps required to get Presto up and running on your local machine:

#### 1. Set up the Presto Server
- Download and install the Presto server tarball
- Create the required directories
- Create and configure 3 property files:
    - `node.properties`
    - `jvm.config`
    - `config.properties`

#### 2. Set up the Presto Client
- Download and install the Presto client
- Prepare the file for use
- Modify the tool's access
- Install Python (if it's not already configured)
- Start the Presto server and client

#### 1. Download the Presto Server tarball

First, download the Presto server tarball by running the command below:

In [None]:
sudo wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.272/presto-server-0.272.tar.gz

#### 2. Unpack the tarball to extract the files

After the file is downloaded, we need to unpack it. To do this, run the below command:

In [None]:
sudo tar -xzvf presto-server-0.272.tar.gz -C /usr/local/

cd /usr/local

# Rename the directory to presto
sudo mv presto-server-0.272 presto

#### 3. Delete the tarball

After all files are unpacked successfully, we'll delete the tarball to save disk space. Change to the directory where the tarball was downloaded and run the command below:

In [None]:
# The tarball is a single file, so the recursive flag is not needed
rm presto-server-0.272.tar.gz

#### 4. Create a Presto `data` directory

Presto requires a data directory to store log files and other data. In this notebook we'll name it `prestodata`. As a best practice, create this directory _outside_ of the installation folder so that its contents are preserved when Presto is upgraded or removed.

Let's go ahead and create this directory as follows:

In [None]:
sudo mkdir /usr/local/prestodata

#### 5. Create an `etc` directory

This will be used to store all the required configuration files. We'll need to create the following configuration files:
- `node.properties`
    - Stores the environmental configurations that are specific to each node
- `jvm.config`
    - Sets the command line options for the Java virtual machine (JVM)
- `config.properties`
    - Sets the configurations for the Presto server itself

We can create it inside the main Presto installation folder as follows:

In [None]:
sudo mkdir /usr/local/presto/etc

#### 6. Create the `node.properties` file

This file contains the configurations that are specific to each node. A _node_ is a single installed instance of Presto on a machine.

Let's go ahead and create this file:

In [None]:
sudo nano /usr/local/presto/etc/node.properties

Once the file is created, add the following then save the file:

In [None]:
node.environment=production
# Replace the placeholder below with a unique ID (for example, a random UUID).
# Keep comments on their own line: properties files do not support inline comments.
node.id=ffffffff-ffff-ffff-ffff-ffffffffffff
node.data-dir=/usr/local/prestodata

Below is a brief description of what each of these properties does:

`node.environment` 
- The name of the environment
- All Presto nodes in a cluster must have the _same_ environment name

`node.id` 
- The unique identifier for this Presto node
- This ID must be unique for every node

`node.data-dir` 
- The location of the Presto data directory
- Presto will store logs and other data here
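
Because every node needs its own `node.id`, one way to avoid reusing the placeholder value is to generate a UUID from the shell. The sketch below assumes a host that provides either `uuidgen` or `/proc/sys/kernel/random/uuid` (typical on Linux). The variable name `node_id` is only illustrative.

```shell
# Generate a random UUID to use as node.id.
# Prefer uuidgen when it is installed; otherwise fall back to the Linux kernel source.
if command -v uuidgen >/dev/null 2>&1; then
  node_id="$(uuidgen)"
else
  node_id="$(cat /proc/sys/kernel/random/uuid)"
fi

# Print the line in the form expected by node.properties
printf 'node.id=%s\n' "$node_id"
```

Copy the printed line into `node.properties` in place of the placeholder ID.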

#### 7. Create the `jvm.config` file

Our next task is to create the Java virtual machine (JVM) configuration file. This file contains a list of command line options used to specify the parameters for the JVM.

Let's create the file by running the below command:

In [None]:
sudo nano /usr/local/presto/etc/jvm.config

Once the file is created, add the following settings and save the file:

In [None]:
-server
-Xmx16G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:+ExitOnOutOfMemoryError
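
The `-Xmx16G` option sets the maximum JVM heap size to 16 GB, which may be more than a local test machine has. As a rough sanity check, the sketch below compares that value with the physical memory reported in `/proc/meminfo`. It assumes a Linux host, and the variable names (`xmx_gb`, `heap_check`) are illustrative.

```shell
# Heap size requested in jvm.config (-Xmx16G), in gigabytes
xmx_gb=16

if [ -r /proc/meminfo ]; then
  # MemTotal is reported in kB; convert it to whole gigabytes
  total_kb="$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
  total_gb=$(( total_kb / 1024 / 1024 ))
  if [ "$total_gb" -lt "$xmx_gb" ]; then
    heap_check="too-large"
    echo "Only ${total_gb} GB of RAM detected: consider lowering -Xmx${xmx_gb}G"
  else
    heap_check="ok"
    echo "${total_gb} GB of RAM detected: -Xmx${xmx_gb}G is within physical memory"
  fi
else
  heap_check="unknown"
  echo "/proc/meminfo is not available; check the machine's memory manually"
fi
```

If the check reports that the heap is too large, lower the `-Xmx` value in `jvm.config` before starting the server.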

#### 8. Create the `config.properties` file

This file contains the configuration information for the Presto server. 

_Note: Every Presto server can function as both a coordinator and a worker. In large enterprise clusters, it's recommended to dedicate one machine to the coordinator role and run the workers on separate machines, as this improves the performance of the overall system. However, for testing and training purposes, both roles can be configured on the same node._

Let's go ahead and create the file by running the below command:

In [None]:
sudo nano /usr/local/presto/etc/config.properties

Once the file is created, add the below settings then save the file:

In [None]:
coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=8080
query.max-memory=5GB
query.max-memory-per-node=1GB
query.max-total-memory-per-node=2GB
discovery-server.enabled=true
discovery.uri=http://localhost:8080

Below is a brief explanation of each of these properties:

`coordinator`
- Sets the current Presto instance to operate as a coordinator (accept queries from clients and manage query execution)

`node-scheduler.include-coordinator`
- Whether or not to enable scheduling work on this coordinator
- For larger clusters, processing work on the coordinator can impact query performance

`http-server.http.port`
- Specifies the port to use for the HTTP server
- Presto uses HTTP for all communication, internal and external.

`query.max-memory`
- The maximum total amount of distributed memory that a query can use

`query.max-memory-per-node`
- The maximum amount of user memory a query can use on any one machine

`query.max-total-memory-per-node`
- The maximum amount of user and system memory that a query may use on any one machine
- _System memory_ is the memory used during execution by readers, writers, and network buffers, etc.

`discovery-server.enabled`
- Presto uses the Discovery service to find all available nodes in a cluster
- Every Presto instance will register itself with the Discovery service on startup

`discovery.uri`
- The URI to the Discovery server
- This should be the URI of the Presto coordinator
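
Since `discovery.uri` has to point at the coordinator's HTTP port, it can be worth reading the values back after editing. The sketch below writes the same settings to a scratch directory (so it does not touch `/usr/local/presto/etc`) and checks that the two properties agree. The `get_prop` helper and the variable names are hypothetical and not part of Presto.

```shell
# Scratch copy for illustration; the real file lives in /usr/local/presto/etc
conf_dir="$(mktemp -d)"
cat > "$conf_dir/config.properties" <<'EOF'
coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=8080
query.max-memory=5GB
query.max-memory-per-node=1GB
query.max-total-memory-per-node=2GB
discovery-server.enabled=true
discovery.uri=http://localhost:8080
EOF

# Hypothetical helper: print the value of key $1 from properties file $2
get_prop() {
  awk -F= -v key="$1" '$1 == key { sub(/^[^=]*=/, ""); value = $0 } END { print value }' "$2"
}

port="$(get_prop 'http-server.http.port' "$conf_dir/config.properties")"
uri="$(get_prop 'discovery.uri' "$conf_dir/config.properties")"

# The discovery URI should end with the same port the HTTP server listens on
case "$uri" in
  *":$port") uri_check="match"; echo "discovery.uri uses port $port" ;;
  *)         uri_check="mismatch"; echo "warning: discovery.uri does not use port $port" ;;
esac

# Remove the scratch directory
rm -r "$conf_dir"
```

To check the real file, pass `/usr/local/presto/etc/config.properties` to `get_prop` instead of the scratch copy.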

This completes the required configuration for the Presto server. Next, we'll set up the Presto client.

## Set up the Presto Client

#### 1. Download and Install the Presto Client

Now that the server is configured, the next group of steps will focus on downloading and installing the Presto client. We'll be using the client as the interface to connect to the server.

First, run the following command to download the Presto client file:

In [None]:
sudo wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.270/presto-cli-0.270-executable.jar

#### 2. Move the `.jar` file to the Presto Server `bin` folder and rename it to `presto`

This will allow you to execute the file as a program to interact with the Presto CLI.

Run the below command:

In [None]:
sudo mv presto-cli-0.270-executable.jar /usr/local/presto/bin/presto

#### 3. Grant execution access to the CLI   

To run the Presto CLI, we need to make the file executable. Change to the `bin` folder and run the command below:

In [None]:
cd /usr/local/presto/bin
sudo chmod +x presto

#### 4. Run the Presto Server

Go one level up in the folder hierarchy and launch the Presto server. 

To do this, run the below commands:

In [None]:
# Go one level up in the folder hierarchy (to the Presto installation folder)
cd ..

# First, run the server in the foreground so that the startup log and any
# errors are printed to the terminal, where they can be fixed.
# This output is not shown when the server is started as a background process.
bin/launcher run

# If the server starts without errors, stop it (Ctrl+C) and then
# start it as a background process
bin/launcher stop
bin/launcher start

If everything runs successfully, you should see something similar to this output:

<p align="center">
  <img src="images/launcher-success.png" width=600>
</p>
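
Once the server has been started in the background, the launcher's `status` command and the server log are two ways to confirm it is still running. The sketch below assumes the installation and data paths used earlier in this notebook, and that the log is written to `var/log/server.log` under the data directory. The variable names are illustrative.

```shell
# Paths follow the locations used earlier in this notebook
launcher="/usr/local/presto/bin/launcher"
server_log="/usr/local/prestodata/var/log/server.log"

if [ -x "$launcher" ]; then
  launcher_found="yes"
  # Reports whether the server process is running
  "$launcher" status || echo "Server is not running"
else
  launcher_found="no"
  echo "Launcher not found at $launcher"
fi

# Show the most recent server log lines, if the log exists yet
if [ -f "$server_log" ]; then
  tail -n 20 "$server_log"
fi
```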

_Note: One common error you may encounter is `/usr/bin/env: ‘python’: No such file or directory`_

<p align="center">
  <img src="images/launcher-error.png" width=600>
</p>

To resolve this error, check that:
- Python is correctly installed 
- The Python folder path is correct

To do this, run the below commands:

In [None]:
# Install Python3
sudo apt-get install python3

If Python is already installed, it will give you a message similar to this one:
<p align="center">
  <img src="images/python-installed.png" width=600>
</p>

Next, create a symbolic link so that the `python` command points to Python 3. Run the command below:

In [None]:
# Create a python symlink that points to python3
sudo ln -s /usr/bin/python3 /usr/bin/python
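
To see which case applies before or after creating the link, a quick check of what is on the `PATH` can help. This is a minimal sketch, and the `python_status` variable is only illustrative.

```shell
# Check whether a 'python' command is available, which the launcher script expects
if command -v python >/dev/null 2>&1; then
  python_status="python"
  echo "python resolves to $(command -v python)"
elif command -v python3 >/dev/null 2>&1; then
  python_status="python3-only"
  echo "Only python3 is on the PATH; the launcher's 'python' lookup would still fail"
else
  python_status="missing"
  echo "No Python interpreter found on the PATH"
fi
```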

#### 5. Check the Presto UI

To double check that the Presto server is running correctly, we can connect to the URL and port which we configured earlier. This will open the cluster overview page.

To do this, open a web browser (such as Firefox) and enter the following address:

In [None]:
127.0.0.1:8080

Assuming all is well, you should see the following:
<p align="center">
  <img src="images/presto-gui.png" width=600>
</p>
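
The same check can be done from the terminal without a browser. The sketch below assumes `curl` is installed and queries the server's `/v1/info` REST endpoint on the port configured earlier. The `server_status` variable is illustrative.

```shell
# Address and port configured in config.properties
server_url="http://127.0.0.1:8080"

# --fail makes curl return a non-zero status on HTTP errors;
# --max-time avoids hanging if nothing is listening on the port
if info="$(curl --silent --fail --max-time 5 "$server_url/v1/info" 2>/dev/null)"; then
  server_status="up"
  echo "Server responded: $info"
else
  server_status="down"
  echo "No response from $server_url (is the server running?)"
fi
```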

#### 6. Run the Client

Next, we'll run the client and connect it to the server by running the below command:

In [None]:
bin/presto --server 127.0.0.1:8080

Assuming all goes well, you should now see the Presto client shell as below:

<p align="center">
  <img src="images/presto-client.png" width=600>
</p>
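
The client can also run a single statement and exit, which is a quick way to confirm the connection end to end. The sketch below uses the CLI's `--execute` option to list the nodes registered with the cluster from the built-in `system.runtime.nodes` table. It assumes the installation path used in this notebook and a running server, and the variable names are illustrative.

```shell
# Location of the CLI after the earlier move and rename
presto_cli="/usr/local/presto/bin/presto"

if [ -x "$presto_cli" ]; then
  cli_found="yes"
  # --execute runs the statement and exits instead of opening the interactive shell
  "$presto_cli" --server 127.0.0.1:8080 \
    --execute 'SELECT node_id, http_uri, node_version FROM system.runtime.nodes' \
    || echo "Query failed; is the server running?"
else
  cli_found="no"
  echo "Presto CLI not found at $presto_cli"
fi
```

On a single-node setup like this one, the query should return one row for the local node.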

That completes the Presto environment setup. The Presto client and server are now connected and ready to go!

The next step is to integrate Presto with whichever tool stores the required data. This integration will be covered in a separate notebook.