# Working with an existing remote Spark via HTTP (sample 1)  

IBM Watson Studio lets Python notebooks work with an existing remote Spark installation over an HTTP connection, using user-friendly sparkmagic commands. This sample notebook shows how to send a simple request to remote Spark.

The remote Spark installation in this sample runs on Hortonworks Data Platform (HDP), which exposes Spark through the Livy HTTP REST API. Livy is an open source REST interface for interacting with [Apache Spark](http://spark.apache.org) from anywhere. It supports executing snippets of code or programs in a Spark context that runs locally or in [Apache Hadoop YARN](http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html).
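To make the REST interaction concrete, the following minimal sketch builds (but does not send) the two requests that a Livy client typically issues: one to create an interactive session and one to submit a code snippet to it. The `/sessions` and `/sessions/{id}/statements` paths and the `kind` and `code` fields come from the Livy REST API; the endpoint URL, the helper names, and the session ID are hypothetical and used only for illustration. Sparkmagic handles these calls for you, so this is not what the notebook itself runs.

```python
import json

# Hypothetical Livy endpoint; replace with the Livy service URL of your cluster.
LIVY_URL = "http://livy.example.com:8998"


def create_session_request(livy_url, kind="pyspark"):
    """Describe the request that asks Livy to start an interactive session."""
    return {
        "method": "POST",
        "url": livy_url.rstrip("/") + "/sessions",
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"kind": kind}),
    }


def run_statement_request(livy_url, session_id, code):
    """Describe the request that submits a code snippet to an existing session."""
    return {
        "method": "POST",
        "url": "%s/sessions/%d/statements" % (livy_url.rstrip("/"), session_id),
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"code": code}),
    }


# Illustrative usage with a made-up session ID.
session_req = create_session_request(LIVY_URL)
statement_req = run_statement_request(LIVY_URL, 1, "sc.parallelize(range(10)).sum()")
print(session_req["method"], session_req["url"])
print(statement_req["method"], statement_req["url"])
```

Each dictionary could be passed to any HTTP client; authentication headers, which depend on how your endpoint is configured, are deliberately left out.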

This notebook runs on Python 2.

## Table of contents

   1.  [Load sparkmagic](#load-sparkmagic)<br>
   2.  [Create a connection to remote Spark](#connection-to-remote-spark)<br>
   3.  [Send a request to remote Spark](#send-request-remote-spark)<br>
   4.  [Delete the remote Spark session](#delete-session)<br>        

<a id="load-sparkmagic"></a>
## 1. Load sparkmagic

Sparkmagic is a set of tools for interactively working with remote Spark clusters through Livy, a Spark REST server, in Jupyter notebooks. The Sparkmagic project includes a set of magics for interactively running Spark code in multiple languages, as well as some kernels that you can use to turn Jupyter into an integrated Spark environment.


In [1]:
%load_ext sparkmagic.magics
import dsx_core_utils
dsx_core_utils.setup_livy_sparkmagic()
%reload_ext sparkmagic.magics

11/02/2017 12:51:12 AM - proxy_util - INFO - Set custom headers.
11/02/2017 12:51:12 AM - proxy_util - INFO - Set proxy user to be useed with Livy.
11/02/2017 12:51:12 AM - proxy_util - INFO - proxy settings set.


<a id="connection-to-remote-spark"></a>
## 2. Create a connection to remote Spark

Run the following cell to invoke the user interface for managing Spark. In the user interface, perform the following tasks to create a connection to the remote Spark:
 * Check the **Manage Endpoints** tab. If an endpoint is already defined, your Watson Studio administrator has configured a default Watson Studio endpoint.
 * Otherwise, select the **Add Endpoint** tab to create the endpoint of the Livy service URL. Type the Livy service URL in the **Address** field, select the authentication type, and specify the authentication credentials if required. Then, select the **Add endpoint** button.
 * Select the **Add Session** tab to create a session. Choose the endpoint, type the session name, and choose the language. Then, select the **Create Session** button. 

In [2]:
%manage_spark

Starting Spark application


| ID | YARN Application ID | Kind | State | Spark UI | Driver log | Current session? |
|---|---|---|---|---|---|---|
| 140 | application_1509308217126_0034 | pyspark | idle | Link | Link | ✔ |


SparkContext available as 'sc'.
HiveContext available as 'sqlContext'.


<a id="send-request-remote-spark"></a>
## 3. Send a request to remote Spark

Run the following cell to send PySpark code to the remote Spark cluster. The code estimates the value of Pi and returns the result.

In [3]:
%%spark 
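import random
NUM_SAMPLES = 100000

# Draw a random point in the unit square; return 1 if it lies inside the quarter circle.
def sample(p):
    x, y = random.random(), random.random()
    return 1 if x*x + y*y < 1 else 0

# Count the hits in parallel on the cluster; hits / NUM_SAMPLES approximates Pi / 4.
count = sc.parallelize(xrange(0, NUM_SAMPLES)).map(sample).reduce(lambda a, b: a + b)
print "Pi is roughly %f" % (4.0 * count / NUM_SAMPLES)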

Pi is roughly 3.145640
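
The estimate relies on a Monte Carlo method: points are drawn uniformly from the unit square, and the fraction that lands inside the quarter circle of radius 1 approaches Pi / 4. The sketch below reproduces the same calculation locally in plain Python, without Spark, so you can check the logic before sending it to the cluster. The function name `estimate_pi` and the `seed` parameter are illustrative additions, not part of the original sample.

```python
import random


def estimate_pi(num_samples, seed=None):
    """Estimate Pi from the share of random points inside the quarter circle."""
    rng = random.Random(seed)  # a seeded generator makes the run repeatable
    hits = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y < 1:
            hits += 1
    # hits / num_samples approximates Pi / 4, so scale by 4
    return 4.0 * hits / num_samples


estimate = estimate_pi(100000, seed=42)
print("Pi is roughly %f" % estimate)
```

In the Spark version, `map(sample)` plays the role of the loop body and `reduce` adds up the hits across partitions. The result varies from run to run because the points are random, and more samples generally give a closer estimate.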

<a id="delete-session"></a>
## 4. Delete the remote Spark session

To delete the session, return to the user interface from step 2, open the **Manage Sessions** tab, and select the **Delete** button.
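
Behind the user interface, deleting a session corresponds to an HTTP `DELETE` on the session resource of the Livy REST API. The following sketch only describes that request and does not send it. The endpoint URL and the helper name are hypothetical, and the session ID 140 is taken from the sample output in step 2.

```python
# Hypothetical Livy endpoint; replace with the Livy service URL of your cluster.
LIVY_URL = "http://livy.example.com:8998"


def delete_session_request(livy_url, session_id):
    """Describe the request that asks Livy to stop and remove a session."""
    return {
        "method": "DELETE",
        "url": "%s/sessions/%d" % (livy_url.rstrip("/"), session_id),
    }


delete_req = delete_session_request(LIVY_URL, 140)
print(delete_req["method"], delete_req["url"])
```

Deleting sessions that you no longer need releases the resources that the session holds on the cluster.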

## Summary

In this notebook, you learned how to send a simple request to remote Spark by using the Livy HTTP REST API.

<div class="alert alert-block alert-info"> Note: To save resources and get the best performance, use the code below to stop the kernel before you exit the notebook.</div>

In [None]:
%%javascript
Jupyter.notebook.session.delete();

<hr>
Copyright &copy; IBM Corp. 2017. Released as licensed Sample Materials.