# Tutorial for working with RapidMiner AI Hub

To use this notebook, you need the [rapidminer](https://github.com/rapidminer/python-rapidminer) package installed in your current Python environment. For installation instructions, consult the [README](https://github.com/rapidminer/python-rapidminer).


### Connect to RapidMiner AI Hub

To connect to RapidMiner AI Hub, provide a URL and your username. The package first creates a process and a web service, which serve your subsequent requests. You will be prompted for your password and a path for the process. To fully automate the execution, you can pass both of these as additional parameters, alongside other optional ones. For more details, consult the [documentation](https://github.com/rapidminer/python-rapidminer/blob/master/docs/api/Server.md).

In [None]:
import rapidminer
username="<set it first please to your user name>"
# You can also provide your password with `password="*****"` and a remote repository path with `processpath="/my/custom/path"` parameters
connector = rapidminer.Server(url="http://localhost:8080", username=username)
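
For unattended runs, one option is to collect the connection settings from environment variables before calling `rapidminer.Server`. The sketch below is only an illustration: the helper `server_kwargs_from_env` and the variable names `RM_URL`, `RM_USERNAME`, `RM_PASSWORD` and `RM_PROCESSPATH` are hypothetical and not part of the package. Only the `url`, `username`, `password` and `processpath` parameters come from the cell above.

```python
import os


def server_kwargs_from_env(env=None):
    """Collect keyword arguments for rapidminer.Server from environment variables.

    The RM_* variable names are hypothetical; pick whatever fits your setup.
    Optional settings that are unset are left out of the result.
    """
    env = os.environ if env is None else env
    kwargs = {
        "url": env.get("RM_URL", "http://localhost:8080"),
        "username": env["RM_USERNAME"],  # required: raises KeyError if missing
    }
    for key, variable in (("password", "RM_PASSWORD"), ("processpath", "RM_PROCESSPATH")):
        if env.get(variable):
            kwargs[key] = env[variable]
    return kwargs


# Demonstration with an explicit mapping instead of the real environment.
example = server_kwargs_from_env({"RM_USERNAME": "myuser", "RM_PASSWORD": "secret"})
print(sorted(example))
# connector = rapidminer.Server(**server_kwargs_from_env())
```

Keeping the password out of the notebook source this way avoids committing it to version control by accident.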

The sections below explain how to use the AI Hub repository. For interacting with projects, see [Project examples](project_examples.ipynb).

### Reading ExampleSets

In [None]:
df = connector.read_resource("//Samples/data/Iris")
print("The result is a pandas DataFrame:")
print(df.head())

If you need to read multiple entries, you can pass multiple paths to the method:

In [None]:
iris, deals, golf = connector.read_resource(["//Samples/data/Iris", "//Samples/data/Deals", "//Samples/data/Golf"])
print("The results are pandas DataFrames:")
print(iris.head(1))
print(deals.head(1))
print(golf.head(1))
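
When the list of paths grows, unpacking the results into separate variables becomes awkward. A minimal sketch of one alternative is to key the results by entry name. It assumes, as the unpacking above suggests, that one result is returned per path in the same order. The helper `read_named` and the stand-in reader `fake_read_resource` are hypothetical names used only for illustration.

```python
def read_named(read_resource, paths):
    """Read several repository entries and key the results by entry name.

    `read_resource` is any callable that accepts a list of paths and returns
    one result per path in the same order, such as `connector.read_resource`.
    """
    names = [path.rstrip("/").rsplit("/", 1)[-1] for path in paths]
    if len(set(names)) != len(names):
        raise ValueError("entry names must be unique to be used as keys")
    results = read_resource(list(paths))
    return dict(zip(names, results))


# Stand-in reader so the sketch runs without a server; it returns labels, not DataFrames.
def fake_read_resource(paths):
    return ["data for " + path for path in paths]


frames = read_named(fake_read_resource, ["//Samples/data/Iris", "//Samples/data/Golf"])
print(frames["Golf"])
# With a live connection: frames = read_named(connector.read_resource, [...])
```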

Unless you are reading from the built-in *Samples* repository, you do not have to specify the name of the repository. Simply give the path to your dataset:

In [None]:
# set the parameter to an existing ExampleSet
df = connector.read_resource("/home/myuser/data/Golf")

Reading an ExampleSet from a RapidMiner AI Hub project is also simple. Let's assume that the project is called *sample-dev*:

In [None]:
df_iris = connector.read_resource("data/Iris", project="sample-dev")

### Writing ExampleSets

Once you have your data ready in pandas, you can upload it to RapidMiner AI Hub with a single method call (see the second cell below):

In [None]:
import pandas
from sklearn.datasets import load_iris

sklearn_iris = load_iris()
iris = pandas.DataFrame(sklearn_iris["data"], columns=sklearn_iris["feature_names"])
iris["target"] = sklearn_iris["target"]

In [None]:
# set the parameter to the desired repository location
connector.write_resource(iris, "/home/" + username + "/iris")

You can write multiple ExampleSets in the same method call as well:

In [None]:
from sklearn.datasets import load_wine
sklearn_wine = load_wine()
wine = pandas.DataFrame(sklearn_wine["data"], columns=sklearn_wine["feature_names"])
wine["target"] = sklearn_wine["target"]
# set the parameter to the desired repository locations
connector.write_resource([iris, wine], ["/home/" + username + "/iris", "/home/" + username + "/wine"])
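
The multi-write call takes two parallel lists that must stay aligned. One way to keep them in sync is to derive both from a single mapping, as in the sketch below. The helper `build_write_arguments` is a hypothetical name, and the `/home/<username>/<name>` layout simply mirrors the paths used in the cells above.

```python
def build_write_arguments(username, named_frames):
    """Turn a {name: DataFrame} mapping into two parallel lists
    (frames, locations), preserving the mapping's insertion order."""
    frames, locations = [], []
    for name, frame in named_frames.items():
        frames.append(frame)
        locations.append("/home/{}/{}".format(username, name.strip("/")))
    return frames, locations


# Placeholder strings stand in for the pandas DataFrames so this runs anywhere.
frames, locations = build_write_arguments(
    "myuser", {"iris": "<iris frame>", "wine": "<wine frame>"}
)
print(locations)
# With a live connection: connector.write_resource(frames, locations)
```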

Writing an ExampleSet back to a versioned RapidMiner AI Hub project requires you to use the [Project](https://github.com/rapidminer/python-rapidminer/blob/master/docs/api/Project.md) class and a git client.

### Running a RapidMiner process
You can start a process and get the results with a single method call. It may take a few seconds for the results to come back as pandas DataFrames:

In [None]:
normalized_iris = connector.run_process("/home/" + username + "/process/normalize_data", [iris])

You can also specify the inputs, the queue to use, and values for macros. For example:

In [None]:
# set the parameters to the desired process, queue and macros
transformed_wine = connector.run_process("/home/" + username + "/transform_inputs", inputs=wine, queue="default", macros={"sample_size" : 100})

You can also run a process from a RapidMiner AI Hub project:

In [None]:
connector.run_process("processes/normalize_iris", project="sample-dev")

### Using Connections
Connections defined in the AI Hub repository can be retrieved with the following method:

In [None]:
connections = connector.get_connections()

The field values of these connections can be accessed in several ways; see the example below. Use these values to establish a connection to a database, cloud service, etc. using an appropriate Python package (e.g. _sqlalchemy_).

In [None]:
from sqlalchemy import create_engine
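# look up the connection by its name in the AI Hub repository
conn = connections["sample-postgres"]
# placeholder: set this to the name of the database you want to query
db_name = "<set it first please to your database name>"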
postgres_str = ('postgresql://{username}:{password}@{host}:{port}/{dbname}'
    .format(username=conn.user,
        password=conn.password,
        host=conn.values["host"],
        port=conn.values["port"],
        dbname=db_name
    )
)
cnx = create_engine(postgres_str).raw_connection()
pandas.read_sql_query("SELECT * FROM test_date_types_1", con=cnx)
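
Passwords that contain characters such as `@` or `/` break a connection URL assembled with plain string formatting. A minimal sketch of one way to guard against this, using only the standard library, is to percent-encode the credentials first. The helper name `postgres_url` and the sample values are hypothetical.

```python
from urllib.parse import quote_plus


def postgres_url(user, password, host, port, dbname):
    """Build a PostgreSQL URL, percent-encoding the user name and password."""
    return "postgresql://{user}:{password}@{host}:{port}/{dbname}".format(
        user=quote_plus(user),
        password=quote_plus(password),
        host=host,
        port=port,
        dbname=dbname,
    )


# Sample values only; in the notebook these would come from the connection object.
print(postgres_url("analyst", "kx@jj5/g", "localhost", 5432, "postgres"))
# prints postgresql://analyst:kx%40jj5%2Fg@localhost:5432/postgres
```

The resulting string can be passed to `create_engine` in the same way as `postgres_str` above.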