# Discovering Frequent Patterns in Big Data Using ECLAT Algorithm

In this tutorial, we discuss the first approach to finding frequent patterns in big data using the ECLAT algorithm.

[__Basic approach:__](#basicApproach) Here, we present the steps to discover frequent patterns using a single minimum support value.
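To make the idea behind ECLAT concrete before using the library, here is a minimal pure-Python sketch. It is not PAMI's implementation. It assumes the textbook formulation of ECLAT: store, for every item, the set of transaction ids that contain it (a "tidset"), and grow itemsets depth-first by intersecting tidsets. The function name `eclat` and the toy database are made up for illustration.

```python
from collections import defaultdict


def eclat(transactions, min_sup_count):
    """Return {itemset (sorted tuple): support} for all frequent itemsets."""
    # Vertical layout: item -> set of ids of the transactions containing it.
    tidsets = defaultdict(set)
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            tidsets[item].add(tid)

    frequent = {}

    def extend(prefix, candidates):
        # candidates: (item, tidset) pairs that are frequent together with prefix
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix + (item,)
            frequent[itemset] = len(tids)
            # Intersect with the tidsets of the later items to grow the itemset.
            next_candidates = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids
                if len(shared) >= min_sup_count:
                    next_candidates.append((other, shared))
            if next_candidates:
                extend(itemset, next_candidates)

    start = sorted(
        ((item, tids) for item, tids in tidsets.items()
         if len(tids) >= min_sup_count),
        key=lambda pair: pair[0],
    )
    extend((), start)
    return frequent


# Toy database (illustrative only).
toy = [
    ["a", "b", "c"],
    ["a", "c"],
    ["a", "d"],
    ["b", "c"],
    ["a", "b", "c", "d"],
]
frequent_itemsets = eclat(toy, min_sup_count=2)
for itemset, support in sorted(frequent_itemsets.items()):
    print(itemset, support)
```

The support of an itemset is simply the size of its tidset, so no rescan of the database is needed once the vertical layout is built.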

***

## <a id='basicApproach'>Basic approach: Executing ECLAT on a single dataset at a particular minimum support value</a>

#### Step 0: Install the PAMI library

In [1]:
!pip install -U pami



#### Step 1: Import the ECLAT algorithm

In [2]:
from PAMI.frequentPattern.basic import ECLAT  as alg


#### Step 2: Specify the following input parameters

In [None]:
inputFile = 'https://u-aizu.ac.jp/~udayrage/datasets/transactionalDatabases/Transactional_T10I4D100K.csv'

minimumSupportCount = 1000  # Users can also specify this constraint as a fraction between 0 and 1.

seperator='\t'       
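The comment on the minimum support says the threshold can be given either as a count or as a value between 0 and 1. A minimal sketch of how the two forms relate, assuming the fractional form is read as a share of the transactions in the database. `to_support_count` is a hypothetical helper, not a PAMI function, and the database size of 100,000 assumes that "D100K" in the file name denotes 100,000 transactions.

```python
def to_support_count(min_sup, n_transactions):
    """Hypothetical helper: express a minimum support as an absolute count."""
    if isinstance(min_sup, float):
        if not 0.0 < min_sup <= 1.0:
            raise ValueError("a fractional minimum support must lie in (0, 1]")
        # A fraction is scaled by the number of transactions.
        return min_sup * n_transactions
    # An integer is taken to be an absolute count already.
    return min_sup


n_transactions = 100_000  # assumed size of the database

count_from_int = to_support_count(1000, n_transactions)
count_from_fraction = to_support_count(0.01, n_transactions)
print(count_from_int, count_from_fraction)
```

Under this reading, the integer `1` and the float `1.0` mean very different things (one transaction versus every transaction), so it is worth being explicit about the type you pass.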

#### Step 3: Execute the ECLAT algorithm

In [None]:
obj = alg.ECLAT(iFile=inputFile, minSup=minimumSupportCount, sep=seperator)  # initialize
obj.startMine()  # start the mining process

#### Step 4: Storing the generated patterns

##### Step 4.1: Storing the generated patterns in a file

In [None]:
obj.save(outFile='frequentPatternsMinSupCount1000.txt')
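Once the patterns are on disk, you may want to load them back for later analysis. The sketch below assumes each line of the saved file has the form `items:support`, with the items of a pattern separated by whitespace. This layout is an assumption, so inspect your own output file before relying on it. The helper `read_patterns` and the sample lines are made up for illustration.

```python
import os
import tempfile


def read_patterns(path):
    """Parse lines of the assumed form 'item1<TAB>item2:support'."""
    patterns = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            # Split on the last colon so the support is always the final field.
            items, support = line.rsplit(":", 1)
            patterns[tuple(items.split())] = int(support)
    return patterns


# Made-up file content, written here only so the sketch is runnable.
sample = "39:4248 \n39\t48:1200 \n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    tmp_path = tmp.name

loaded = read_patterns(tmp_path)
os.remove(tmp_path)
print(loaded)
```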

##### Step 4.2: Storing the generated patterns in a data frame

In [None]:
frequentPatternsDF = obj.getPatternsAsDataFrame()

#### Step 5: Getting the statistics

##### Step 5.1: Total number of discovered patterns 

In [None]:
print('Total No of patterns: ' + str(len(frequentPatternsDF)))

##### Step 5.2: Runtime consumed by the mining algorithm

In [None]:
print('Runtime: ' + str(obj.getRuntime()))

##### Step 5.3: Total Memory consumed by the mining algorithm

In [None]:
print('Memory (RSS): ' + str(obj.getMemoryRSS()))
print('Memory (USS): ' + str(obj.getMemoryUSS()))

# Advanced Tutorial on Implementing the ECLAT Algorithm

In this tutorial, we discuss the second approach to finding frequent patterns in big data using the ECLAT algorithm.

[__Advanced approach:__](#advApproach) Here, we generalize the basic approach by presenting the steps to discover frequent patterns using multiple minimum support values.

***

#### In this tutorial, we explain how the ECLAT algorithm can be implemented by varying the minimum support values

#### Step 1: Import the ECLAT algorithm and pandas data frame

In [None]:
from PAMI.frequentPattern.basic import ECLAT  as alg
import pandas as pd

#### Step 2: Specify the following input parameters

In [None]:
inputFile = '/userData/likhitha/new/frequentPattern/transactional_T10I4D100K.csv'
seperator='\t'
minimumSupportCountList = [100, 150, 200, 250, 300]
# The minimum support can also be specified as a fraction between 0 and 1, e.g., minSupList = [0.005, 0.006, 0.007, 0.008, 0.009]

# initialize a data frame to store the results of the ECLAT algorithm
result = pd.DataFrame(columns=['algorithm', 'minSup', 'patterns', 'runtime', 'memory'])

#### Step 3: Execute the ECLAT algorithm using a for loop

In [None]:
algorithm = 'ECLAT'  #specify the algorithm name
for minSupCount in minimumSupportCountList:
    obj = alg.ECLAT(inputFile, minSup=minSupCount, sep=seperator)
    obj.startMine()
    #store the results in the data frame
    result.loc[result.shape[0]] = [algorithm, minSupCount, len(obj.getPatterns()), obj.getRuntime(), obj.getMemoryRSS()]
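The loop above repeats the same mining run at several thresholds and records one row per run. The following sketch shows the same pattern without PAMI, so the effect of the threshold is easy to verify by hand. It uses a brute-force counter (`count_frequent`, a hypothetical helper that is only viable on toy data) in place of ECLAT, and collects the rows in a plain list of dictionaries.

```python
import time
from itertools import combinations


def count_frequent(transactions, min_sup_count):
    """Brute-force count of frequent itemsets (only viable on toy data)."""
    items = sorted({item for t in transactions for item in t})
    sets = [set(t) for t in transactions]
    found = 0
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            # Support = number of transactions containing every candidate item.
            support = sum(1 for t in sets if t.issuperset(candidate))
            if support >= min_sup_count:
                found += 1
    return found


# Toy database (illustrative only).
toy = [
    ["a", "b", "c"],
    ["a", "c"],
    ["a", "d"],
    ["b", "c"],
    ["a", "b", "c", "d"],
]

rows = []
for min_sup in [1, 2, 3, 4]:
    start = time.perf_counter()
    n_patterns = count_frequent(toy, min_sup)
    rows.append({
        "minSup": min_sup,
        "patterns": n_patterns,
        "runtime": time.perf_counter() - start,
    })

counts = [row["patterns"] for row in rows]
print(counts)
```

Raising the minimum support can only shrink the set of frequent patterns, which is why the pattern counts never increase along the sweep. A list of dictionaries like `rows` can also be passed to `pd.DataFrame(rows)` in one step, which avoids growing a data frame row by row.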


#### Step 4: Print the results

In [None]:
print(result)

#### Step 5: Visualizing the results

##### Step 5.1: Importing the plot library

In [None]:
from PAMI.extras.graph import plotLineGraphsFromDataFrame as plt

##### Step 5.2: Plotting the number of patterns

In [None]:
ab = plt.plotGraphsFromDataFrame(result)
ab.plotGraphsFromDataFrame() #drawPlots()

#### Step 6: Saving the results as LaTeX files

In [None]:
from PAMI.extras.graph import generateLatexFileFromDataFrame as gdf
gdf.generateLatexCode(result)