# Tutorial for working with Altair AI Studio

To use this notebook, install the [rapidminer](https://github.com/rapidminer/python-rapidminer-beta) package in your current Python environment. For installation instructions, consult the [README](https://github.com/rapidminer/python-rapidminer-beta#rapidminer-python-package---beta-version).


### Connect to local Studio instance

To connect, set the `rm_home` variable to the installation directory of your AI Studio.

In [None]:
import rapidminer
import os
rm_home="<set it first please to the installation directory of Studio>"
# If you don't want to see the log messages of the operations, use rm_stdout=open(os.devnull,"w")
connector = rapidminer.Studio(rm_home, rm_stdout=None)
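
A placeholder or mistyped `rm_home` is an easy mistake to make, so it can help to check the path before connecting. The sketch below is only an illustration: `check_rm_home` is a hypothetical helper, not part of the `rapidminer` package, and it only confirms that the directory exists, not that it contains a valid Studio installation.

```python
import os


def check_rm_home(rm_home):
    """Return the absolute path of rm_home, or raise if it is not a directory.

    Hypothetical helper: it does not verify that the directory actually
    contains an AI Studio installation.
    """
    path = os.path.abspath(os.path.expanduser(rm_home))
    if not os.path.isdir(path):
        raise FileNotFoundError(f"rm_home is not an existing directory: {path}")
    return path


# Usage (commented out because it needs a real installation):
# connector = rapidminer.Studio(check_rm_home(rm_home), rm_stdout=open(os.devnull, "w"))
```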

### Reading ExampleSets

In [None]:
df = connector.read_resource("//Samples/data/Iris")
print("The result is a pandas DataFrame:")
print(df.head())

The operation launches an Altair AI Studio instance in the background, which can take a few seconds. If you need to read multiple entries, you can speed up the operation by passing a list of repository paths to the method in a single call:

In [None]:
iris, deals, golf = connector.read_resource(["//Samples/data/Iris", "//Samples/data/Deals", "//Samples/data/Golf"])
print("The results are pandas DataFrames:")
print(iris.head(1))
print(deals.head(1))
print(golf.head(1))
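
When many entries are read at once, unpacking into separate variables gets unwieldy. A minimal sketch of one alternative follows, assuming (as the unpacking above suggests) that `read_resource` returns results in the same order as the paths it was given. `read_named` is a hypothetical helper, not part of the `rapidminer` package.

```python
def read_named(connector, paths):
    """Read several repository entries in one call and key them by entry name."""
    frames = connector.read_resource(list(paths))
    # The last path segment ("Iris" in "//Samples/data/Iris") becomes the key.
    names = [p.rstrip("/").split("/")[-1] for p in paths]
    return dict(zip(names, frames))


# Usage (commented out because it needs a running connector):
# data = read_named(connector, ["//Samples/data/Iris", "//Samples/data/Golf"])
# print(data["Iris"].head(1))
```

Note that two entries with the same name in different folders would collide in the dictionary, so this only suits sets of uniquely named entries.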

You can also read repository entry files (such as `.ioo` files) that are stored outside a repository:

In [None]:
# set the parameter to an existing .ioo file (the raw string keeps the backslashes literal)
df = connector.read_resource(rapidminer.File(r"C:\path\to\my\data.ioo"))
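
Windows paths contain backslashes, which Python interprets as escape characters in ordinary string literals (for example, `\t` becomes a tab). The short sketch below shows two safe ways to write such a path; the path itself is just the placeholder used in this tutorial.

```python
from pathlib import PureWindowsPath

plain = "C:\\path\\to\\my\\data.ioo"  # every backslash escaped by hand
raw = r"C:\path\to\my\data.ioo"       # raw string: backslashes stay literal
assert plain == raw

# With a raw string, "\t" in "\to" is not turned into a tab character.
assert "\t" not in raw

# PureWindowsPath parses Windows-style paths on any operating system.
print(PureWindowsPath(raw).name)  # data.ioo
```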

You can also specify repository locations with `rapidminer.RepositoryLocation` objects. (This is not necessary, since string parameters are treated as repository locations.) This example also raises the log level to `WARNING`, so that fewer entries are logged to the console.

In [None]:
import logging
connector.logger.setLevel(logging.WARNING)
df = connector.read_resource(rapidminer.RepositoryLocation(name="//Samples/data/Iris"))
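
The effect of raising the log level can be checked with the standard `logging` module alone. This is a minimal sketch using a stand-in logger (the name `loglevel_demo` is made up for the example), on the assumption that `connector.logger` behaves like a regular `logging.Logger`.

```python
import logging

logger = logging.getLogger("loglevel_demo")  # hypothetical logger name
logger.setLevel(logging.WARNING)

# INFO (20) is below WARNING (30), so INFO messages are dropped;
# ERROR (40) is above it, so ERROR messages still get through.
print(logger.isEnabledFor(logging.INFO))   # False
print(logger.isEnabledFor(logging.ERROR))  # True
```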

### Writing ExampleSets

In [None]:
import pandas
from sklearn.datasets import load_iris

sklearn_iris = load_iris()
iris = pandas.DataFrame(sklearn_iris["data"], columns=sklearn_iris["feature_names"])
iris["target"] = sklearn_iris["target"]

In [None]:
# set the parameter to the desired repository location
connector.write_resource(iris, "//Local Repository/data/Iris")

You can write multiple resources in the same method call as well:

In [None]:
from sklearn.datasets import load_wine
sklearn_wine = load_wine()
wine = pandas.DataFrame(sklearn_wine["data"], columns=sklearn_wine["feature_names"])
wine["target"] = sklearn_wine["target"]
# set the parameter to the desired repository locations
connector.write_resource([iris, wine], ["//Local Repository/data/Iris", "//Local Repository/data/Wine"])

As with reading resources, you can also write resources to regular files, outside any RapidMiner repository:

In [None]:
# set the parameter to the desired file (the raw string keeps the backslashes literal)
connector.write_resource(iris, rapidminer.File(r"C:\path\to\the\output\file.ioo"))

You can also save any Python object, including scikit-learn models, and load it later with the `read_resource` method:

In [None]:
from sklearn.tree import DecisionTreeClassifier
clf = DecisionTreeClassifier(random_state=0)
clf.fit(iris[sklearn_iris["feature_names"]], iris["target"])
# set the parameter to the desired repository location
connector.write_resource(clf, "//Local Repository/data/IrisModel")
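
Reading a saved object back uses the same `read_resource` call as for ExampleSets. The following sketch wraps the round trip in a hypothetical helper, `save_and_reload`, which is not part of the `rapidminer` package; it only combines the two calls already shown in this tutorial.

```python
def save_and_reload(connector, obj, location):
    """Write obj to a repository location, then read it back from there."""
    connector.write_resource(obj, location)
    return connector.read_resource(location)


# Usage (commented out because it needs a running connector):
# restored_clf = save_and_reload(connector, clf, "//Local Repository/data/IrisModel")
# print(restored_clf.predict(iris[sklearn_iris["feature_names"]])[:5])
```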

### Running a RapidMiner process
You can start a process and get the results with a single method call:

In [None]:
normalized_iris = connector.run_process("//Samples/processes/02_Preprocessing/01_Normalization")

You can also define inputs, run only a single operator, or define values for macros. For example:

In [None]:
import pandas
from sklearn.datasets import load_wine
sklearn_wine = load_wine()
wine = pandas.DataFrame(sklearn_wine["data"], columns=sklearn_wine["feature_names"])
wine["target"] = sklearn_wine["target"]
wine["correlated1"] = wine["alcohol"]*2
wine["correlated2"] = wine["alcohol"]+wine["magnesium"]

In [None]:
normalized_wine, original = connector.run_process("//Samples/processes/04_Attributes/01_RemoveCorrelatedFeatures", 
                                                  inputs=wine, 
                                                  operator="RemoveCorrelatedFeatures")
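
The example above sets `inputs` and `operator` but not macros. The sketch below shows one way to pass all three options only when they are provided. It rests on an assumption not confirmed in this tutorial: that `run_process` accepts a `macros` keyword holding a dictionary of macro names and values. `run_process_with_options` is a hypothetical wrapper, and the macro name in the usage comment is made up.

```python
def run_process_with_options(connector, path, inputs=None, operator=None, macros=None):
    """Call connector.run_process, forwarding only the options that were set.

    Assumption: run_process accepts `inputs`, `operator` and `macros` keywords.
    """
    kwargs = {}
    if inputs is not None:
        kwargs["inputs"] = inputs
    if operator is not None:
        kwargs["operator"] = operator
    if macros is not None:
        kwargs["macros"] = dict(macros)  # copy so the caller's dict is untouched
    return connector.run_process(path, **kwargs)


# Usage (commented out; "my_macro" is a made-up macro name):
# results = run_process_with_options(
#     connector,
#     "//Samples/processes/04_Attributes/01_RemoveCorrelatedFeatures",
#     inputs=wine,
#     macros={"my_macro": "0.9"},
# )
```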