# Association Rule Mining using EasyMiner API
This example demonstrates association rule mining through the REST API of the EasyMiner data mining system.
<br /><br />
To use this example, you must have a working instance of EasyMiner. For testing purposes, you can use our demo server.

## Dataset IRIS
This example code is based on the IRIS dataset from the [UCI Repository](https://archive.ics.uci.edu/ml/datasets/iris). The file used in this example is located in the same folder as this notebook: [iris.csv](./iris.csv)
<br /><br />
The dataset contains the columns *sepallength*, *petalwidth*, *sepalwidth*, *petallength* and *class*. For rule mining, as well as for building a classification model, the column *class* should be used in the consequent of the rules, and the other columns in the antecedent.
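
The split between antecedent and consequent columns can be sketched locally before anything is uploaded. This is a minimal illustration only: the helper `split_columns` and the two-line CSV excerpt are hypothetical, and the column order in the real `iris.csv` may differ.

```python
import csv
import io


def split_columns(csv_text, target="class", separator=","):
    """Return (antecedent_columns, consequent_column) based on the CSV header row."""
    header = next(csv.reader(io.StringIO(csv_text), delimiter=separator))
    if target not in header:
        raise ValueError("Column '" + target + "' not found in header: " + str(header))
    # Every column except the target is a candidate for the antecedent
    antecedent = [name for name in header if name != target]
    return antecedent, target


# Hypothetical excerpt standing in for the real iris.csv (assumed column order)
sample = "sepallength,sepalwidth,petallength,petalwidth,class\n5.1,3.5,1.4,0.2,Iris-setosa\n"
antecedent_columns, consequent_column = split_columns(sample)
print(antecedent_columns, consequent_column)
```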

## 1. Set up variables, import dependencies
To run this example, you have to configure the following variables:

In [None]:
# Import required libraries
import requests
import json
import time
import urllib.parse  # import the submodule explicitly; urllib.parse.quote is used below

# Configure access variables
API_URL = 'https://br-dev.lmcloud.vse.cz/easyminercenter/api'
API_KEY = 'mfC87Pp75zn018098owhU089J461260m75AtGGkybwDEtiTqejE'

# Set up details of the file to upload
CSV_FILE = 'iris.csv'
CSV_SEPARATOR = ','
CSV_ENCODING = 'utf8'

## 2. Upload CSV file to EasyMiner server

In [None]:
# HTTP request for uploading the CSV file (the file is closed automatically after the upload)
with open(CSV_FILE, 'rb') as csv_file:
    r = requests.post(API_URL + '/datasources?separator=' + urllib.parse.quote(CSV_SEPARATOR) + '&encoding=' + CSV_ENCODING + '&type=limited&apiKey=' + API_KEY, files = {"file": csv_file}, headers = {"Accept": "application/json"})

# Get the datasource ID (identifies the dataset on the EasyMiner server) from the server response
datasource_id = r.json()["id"]

# For debug purposes, print datasource_id - if the datasource was created successfully, the datasource_id should be greater than 0
print('Created datasource ID: '+str(datasource_id))
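
As an alternative to concatenating the query string by hand, the upload URL can be assembled with `urllib.parse.urlencode`, which escapes every parameter value. The following is only a sketch: the helper `build_upload_url`, the host `example.org` and the placeholder key are hypothetical, while the parameter names are the ones used in the request above.

```python
from urllib.parse import urlencode


def build_upload_url(api_url, api_key, separator=",", encoding="utf8"):
    """Build the datasource upload URL with all query parameters escaped."""
    query = urlencode({
        "separator": separator,
        "encoding": encoding,
        "type": "limited",
        "apiKey": api_key,
    })
    return api_url + "/datasources?" + query


# Placeholder server address and key, for illustration only
upload_url = build_upload_url("https://example.org/easyminercenter/api", "YOUR_API_KEY")
print(upload_url)
```

With the `requests` library, passing the same dictionary through the `params` argument of `requests.post` has a comparable effect.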


## 3. Create Miner

In [None]:
# Define a name for the miner (optional; it helps you find the miner in the list of miners)
miner_name = 'TEST MINER'

# JSON configuration of the API request (will be sent as body of the HTTP request)
json_data = json.dumps({"name": miner_name, "type": "cloud", "datasourceId": datasource_id})

# Send request for miner creation
r = requests.post(API_URL + "/miners?apiKey=" + API_KEY, headers = {'Content-Type': 'application/json', "Accept": "application/json"}, data = json_data.encode())

# Get the ID of the created miner (identifies the miner on the EasyMiner server)
miner_id = r.json()["id"]

# For debug purposes, print miner_id - if the miner was created successfully, the miner_id should be greater than 0
print('Created miner ID: '+str(miner_id))
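
Both steps so far read the `id` property directly from the response, which fails with an unhelpful `KeyError` if the server reports an error. A small guard can make such failures easier to diagnose. This is a hedged sketch: `extract_id` is a hypothetical helper, and the sample payloads below are invented for illustration, not actual EasyMiner responses.

```python
def extract_id(status_code, payload):
    """Return the 'id' from a parsed JSON response, or raise a readable error."""
    if status_code >= 400:
        raise RuntimeError("Request failed with HTTP " + str(status_code) + ": " + str(payload))
    if "id" not in payload:
        raise RuntimeError("Response does not contain an 'id': " + str(payload))
    return payload["id"]


# Invented payloads standing in for r.status_code and r.json()
created_id = extract_id(201, {"id": 42, "name": "TEST MINER"})
print(created_id)

try:
    extract_id(401, {"message": "invalid API key"})
except RuntimeError as error:
    print(error)
```

In the notebook, the helper would be called as `extract_id(r.status_code, r.json())` after each creation request.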


## 4. Preprocess data 
The data fields of the uploaded datasource cannot be used directly in the definition of a data mining task. You have to generate an attribute from each data field you want to use.
<br /><br />
The simplest option is to take the values of the data field "as they are", using the preprocessing method "each value - one bin".
<br /><br />
The uploaded data fields are identified by their names. Note that these names need not be exactly the same as in the uploaded file (for example, when the file contains duplicate names). You should therefore get the list of data fields (columns) in the datasource:

In [None]:
# Request the list of columns (data fields) available in the existing datasource from EasyMiner
r = requests.get(API_URL + '/datasources/' + str(datasource_id) + '?apiKey=' + API_KEY, headers = {'Content-Type': 'application/json', "Accept": "application/json"})

# The response contains the properties of the datasource as well as the list of columns. Get only the columns...
datasource_columns = r.json()['column']
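
To check which names the server actually assigned, the column list can be turned into a lookup from name to ID. The sketch below runs on an invented response: the `name` key is the one used later in this notebook, while the `id` key and all values are assumptions about the response shape.

```python
# Invented stand-in for r.json(); the real response may contain more properties
sample_response = {
    "id": 1,
    "column": [
        {"id": 11, "name": "sepallength"},
        {"id": 12, "name": "sepalwidth"},
        {"id": 13, "name": "petallength"},
        {"id": 14, "name": "petalwidth"},
        {"id": 15, "name": "class"},
    ],
}

# Map each column name to its ID so either can be used in later requests
columns_by_name = {col["name"]: col["id"] for col in sample_response["column"]}

for name, column_id in columns_by_name.items():
    print(column_id, name)
```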


### Construction of preprocessing requests
If you want to preprocess all the columns of the datasource using the method "each value - one bin", you can simply use the following code:

In [None]:
for col in datasource_columns:
    # You can work with the column name or the column ID. Both these values are parsed from the previous JSON response.
    column_name = col['name']
    
    # You have to select a name for the new attribute; here, the column name is reused
    attribute_name = column_name
    
    # Construct the definition of the preprocessing request;
    # to identify the column from the datasource, you can use its ID (set it to property "column") or its name (set it to property "columnName").
    json_data = json.dumps({"miner": miner_id, "name": attribute_name, "columnName": column_name, "specialPreprocessing": "eachOne"})
    
    # Send the request and wait for the response;
    # depending on the size of the datasource, this can take a while...
    r = requests.post(API_URL + "/attributes?apiKey=" + API_KEY, headers = {'Content-Type': 'application/json', "Accept": "application/json"}, data = json_data.encode())
    if r.status_code != 201:
        break  # error occurred - the preprocessing of the selected attribute failed 
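
The loop above stops silently at the first failure. A variant that records which columns failed could look as follows. This is a sketch under stated assumptions: `build_attribute_request`, `preprocess_all` and `fake_send` are hypothetical helpers, and `fake_send` merely simulates the server so that the example runs without network access. The request body uses the same properties as the loop above.

```python
import json


def build_attribute_request(miner_id, column_name, attribute_name=None):
    """Return the JSON body for an "each value - one bin" preprocessing request."""
    return json.dumps({
        "miner": miner_id,
        "name": attribute_name or column_name,
        "columnName": column_name,
        "specialPreprocessing": "eachOne",
    })


def preprocess_all(column_names, miner_id, send):
    """Send one request per column; return a list of (column_name, status_code) failures."""
    failures = []
    for column_name in column_names:
        status_code = send(build_attribute_request(miner_id, column_name))
        if status_code != 201:
            failures.append((column_name, status_code))
    return failures


def fake_send(body):
    """Simulated server: rejects the column 'class', accepts everything else."""
    return 500 if json.loads(body)["columnName"] == "class" else 201


failures = preprocess_all(["sepallength", "petalwidth", "class"], miner_id=1, send=fake_send)
print(failures)
```

In the notebook, `send` would wrap the `requests.post` call to `/attributes` and return `r.status_code`.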
