# Package Manager


## Introduction

You can use the PackageManager component of PixieDust to install maven packages into your notebook kernel, and to uninstall them, without editing configuration files. This component is especially useful when you run notebooks in a hosted cloud environment where you do not have access to the configuration files.

You can use this component in the following ways:

* [List installed packages](#list)
* Install packages
  * [Install a spark package from spark-packages.org](#install_from_spark_packages)
  * [Install from maven search repository](#install_from_maven)
  * [Install a jar file directly from an addressable location](#install_from_generic)
* [Uninstall packages](#uninstall)


## Prerequisites

In [None]:
import pixiedust

## <a id="list"></a>List packages

To list the installed packages, run `pixiedust.printAllPackages()`:

In [None]:
#import pixiedust
pixiedust.printAllPackages()

## Install Packages

### <a id="install_from_spark_packages"></a>Install a spark package from spark-packages.org

* Go to [spark-packages.org](https://spark-packages.org/), and search for your package.

* Click the link for your package and locate the code to run the package in spark-shell, pyspark, or spark-submit. For example, you would retrieve the following line:

```shell
 > $SPARK_HOME/bin/spark-shell --packages graphframes:graphframes:0.1.0-spark1.6
```

* Copy the maven ID of the package. A maven ID consists of a `groupId`, an `artifactId`, and a `version`, separated by colons (`:`). In the previous example, the maven ID is the value that follows the `--packages` option: `graphframes:graphframes:0.1.0-spark1.6`.
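
To make the colon-separated format concrete, the following minimal sketch splits a maven ID into its three parts. The `parse_maven_id` helper and the `MavenId` tuple are hypothetical names introduced for illustration; they are not part of the PixieDust API.

```python
from collections import namedtuple

# Hypothetical container for the three parts of a maven ID.
MavenId = namedtuple("MavenId", ["group_id", "artifact_id", "version"])


def parse_maven_id(maven_id):
    """Split 'groupId:artifactId:version' into its three components."""
    parts = maven_id.strip().split(":")
    # Reject IDs that do not have exactly three non-empty parts.
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected groupId:artifactId:version, got %r" % maven_id)
    return MavenId(*parts)


coords = parse_maven_id("graphframes:graphframes:0.1.0-spark1.6")
print(coords.group_id, coords.artifact_id, coords.version)
```

A check like this can catch a mistyped ID before it is passed to `pixiedust.installPackage`.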

To load a specific package version, pass its maven ID to `pixiedust.installPackage`:

In [None]:
#import pixiedust
pixiedust.installPackage("graphframes:graphframes:0.1.0-spark1.6")

> Specify version `0` to fetch the latest release: 
> `pixiedust.installPackage("graphframes:graphframes:0")`

The output includes a line that instructs you to restart the kernel to complete the installation of the new package. A restart is required only the first time you install the package. Restart the kernel by using the Kernel/Restart menu. After the kernel restarts, the library is on the classpath and can be used from your Python notebook.

> Some libraries, such as GraphFrames, include a Python module. PixieDust automatically adds the Python file to the SparkContext. However, you must explicitly call `pixiedust.installPackage` at the beginning of every kernel session so that the Python modules are added to the SparkContext.
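
One way to follow this advice is to keep the required maven IDs in a list and install them all in the first cell of the notebook. The sketch below is only an illustration: `REQUIRED_PACKAGES` and `install_all` are hypothetical names, and a stand-in callable replaces `pixiedust.installPackage` so that the code runs without PixieDust.

```python
# Packages this notebook needs in every kernel session (example ID from above).
REQUIRED_PACKAGES = ["graphframes:graphframes:0.1.0-spark1.6"]


def install_all(packages, install):
    """Call install(package) for each entry, in order, and return those processed."""
    done = []
    for package in packages:
        install(package)
        done.append(package)
    return done


# In a notebook with PixieDust available, the first cell would contain:
#   import pixiedust
#   install_all(REQUIRED_PACKAGES, pixiedust.installPackage)
# A stand-in callable is used here so the sketch runs on its own.
install_all(REQUIRED_PACKAGES, lambda package: print("would install", package))
```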

### <a id="install_from_maven"></a>Install from maven search repository

* Go to the maven search site, [search.maven.org](http://search.maven.org), and search for the package you want, such as `org.apache.commons`.
* On the results page, open the link for the component you want, such as `commons-proxy`.
* Run the `installPackage` method and specify the *groupId*, *artifactId*, and *version* parameters: `pixiedust.installPackage("groupId:artifactId:version")`

In [None]:
# import pixiedust
pixiedust.installPackage("org.apache.commons:commons-proxy:1.0")

> Specify version `0` to fetch the latest release: 
> `pixiedust.installPackage("org.apache.commons:commons-proxy:0")`

By default, PixieDust searches the following two maven repositories: http://repo1.maven.org/maven2 and http://dl.bintray.com/spark-packages/maven. To use a custom maven repository, specify it with the `base` keyword argument:

In [None]:
# import pixiedust
pixiedust.installPackage("org.apache.commons:commons-proxy:0", base="http://repo1.maven.org/maven2")
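
To show how a maven ID relates to a repository base URL, the sketch below builds the jar location according to the standard maven repository layout (`groupId` dots become directories, followed by `artifactId`, `version`, and the file `artifactId-version.jar`). The `maven_jar_url` helper is a hypothetical name, and this is not necessarily how PixieDust resolves packages internally. It also does not handle the version `0` shortcut, which requires looking up the latest release.

```python
def maven_jar_url(maven_id, base="http://repo1.maven.org/maven2"):
    """Build the conventional jar URL for a maven ID under a repository base."""
    group_id, artifact_id, version = maven_id.split(":")
    # Dots in the groupId map to directory separators in the repository.
    group_path = group_id.replace(".", "/")
    return "{0}/{1}/{2}/{3}/{2}-{3}.jar".format(
        base.rstrip("/"), group_path, artifact_id, version)


print(maven_jar_url("org.apache.commons:commons-proxy:1.0"))
```

Passing a different `base` to this helper mirrors what the `base` keyword argument expresses: the same coordinates, looked up under another repository root.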

### <a id="install_from_generic"></a>Install a JAR file directly from an addressable location

To install a JAR file that is not published in a maven repository, provide the URL of the JAR file. PixieDust then bypasses the maven lookup and downloads the JAR file directly from the specified location:

In [None]:
#import pixiedust
pixiedust.installPackage("https://github.com/ibm-cds-labs/spark.samples/raw/master/dist/streaming-twitter-assembly-1.6.jar")
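
The difference between the two kinds of argument can be sketched as a simple check on the URL scheme. The `is_direct_url` helper below is hypothetical and assumes that only `http` and `https` locations count as direct downloads; PixieDust's own detection logic may differ. The sketch uses the Python 3 standard library.

```python
from urllib.parse import urlparse


def is_direct_url(package):
    """Return True if the argument looks like an http(s) URL rather than a maven ID."""
    # A maven ID such as 'a:b:1.0' has no http/https scheme, so it returns False.
    return urlparse(package).scheme in ("http", "https")


jar_url = ("https://github.com/ibm-cds-labs/spark.samples/raw/master/dist/"
           "streaming-twitter-assembly-1.6.jar")
print(is_direct_url(jar_url))
print(is_direct_url("graphframes:graphframes:0.1.0-spark1.6"))
```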

## <a id="uninstall"></a>Uninstall packages

To uninstall a package, run `pixiedust.uninstallPackage(<<packagename>>)`, passing the package name, for example the maven ID that was used to install it:

In [None]:
#import pixiedust
pixiedust.uninstallPackage("graphframes:graphframes:0.1.0-spark1.6")

<hr>
Copyright &copy; IBM Corp. 2017. This notebook and its source code are released under the terms of the MIT License.