<a href="https://colab.research.google.com/github/UdayLab/PAMI/blob/main/notebooks/frequentPattern/basic/ECLATDiffset.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Discovering Frequent Patterns in Big Data Using the ECLATDiffset Algorithm

In this tutorial, we describe a basic approach to finding frequent patterns in big data using the ECLATDiffset algorithm.

[__Basic approach:__](#basicApproach) Here, we present the steps to discover frequent patterns using a single minimum support value.
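
For readers unfamiliar with the idea behind the algorithm's name, the sketch below illustrates diffset-based support counting on a toy database. It is a minimal illustration written for this tutorial, not PAMI's implementation: the toy transactions and the helper name `tidset` are hypothetical.

```python
# Toy transactional database: transaction id -> set of items (hypothetical data).
transactions = {
    1: {"a", "b", "c"},
    2: {"a", "c"},
    3: {"a", "b"},
    4: {"b", "c"},
    5: {"a", "b", "c"},
}


def tidset(item):
    """Return the ids of all transactions that contain `item`."""
    return {tid for tid, items in transactions.items() if item in items}


# Diffset of {a, b} relative to its prefix {a}:
# the transactions that contain 'a' but not 'b'.
diffset_ab = tidset("a") - tidset("b")
diffset_ac = tidset("a") - tidset("c")

# support(ab) = support(a) - |diffset(ab)|
support_a = len(tidset("a"))
support_ab = support_a - len(diffset_ab)

# One level deeper, diffsets are derived from diffsets alone:
# diffset(abc) = diffset(ac) - diffset(ab)
# support(abc) = support(ab) - |diffset(abc)|
diffset_abc = diffset_ac - diffset_ab
support_abc = support_ab - len(diffset_abc)

# Cross-check both results against direct tidset intersections.
assert support_ab == len(tidset("a") & tidset("b"))
assert support_abc == len(tidset("a") & tidset("b") & tidset("c"))

print("support(ab) =", support_ab)
print("support(abc) =", support_abc)
```

The point of the technique is that diffsets tend to be much smaller than transaction-id sets on dense data, so less has to be stored and compared at each level.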

***

## <a id='basicApproach'>Basic approach: Executing ECLATDiffset on a single dataset at a particular minimum support value</a>

#### Step 0: Install the PAMI library

In [1]:
!pip install -U pami

Collecting pami
  Downloading pami-2023.7.28.5-py3-none-any.whl (796 kB)
Collecting resource (from pami)
  Downloading Resource-0.2.1-py2.py3-none-any.whl (25 kB)
Collecting validators (from pami)
  Downloading validators-0.20.0.tar.gz (30 kB)
  Preparing metadata (setup.py) ... done
Collecting JsonForm>=0.0.2 (from resource->pami)
  Downloading JsonForm-0.0.2.tar.gz (2.4 kB)
  Preparing metadata (setup.py) ... done
Collecting JsonSir>=0.0.2 (from resource->pami)
  Downloading JsonSir-0.0.2.tar.gz (2.2 kB)
  Preparing metadata (setup.py) ... done
Collecting python-easyconfig>=0.1.0 (from resource->pami)
  Downloading Python_EasyConfig-0.1.7-py2.py3-none-any.whl (5.4 kB)
Building wheels for collected packages: validators, JsonForm, JsonSir
  Building wheel for validators (setup.py) ... done
  Created wheel for validators

#### Step 1: Import the ECLATDiffset algorithm

In [2]:
from PAMI.frequentPattern.basic import ECLATDiffset  as alg

#### Step 2: Specify the following input parameters

In [3]:
inputFile = 'https://u-aizu.ac.jp/~udayrage/datasets/transactionalDatabases/Transactional_T10I4D100K.csv'

minimumSupportCount = 1000  # Users can also specify this constraint as a value between 0 and 1.

seperator='\t'
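
The minimum support can be given either as an absolute count or as a value between 0 and 1. The sketch below shows one way the two forms can relate, assuming a fractional value is read as a proportion of the number of transactions in the database. The helper `to_support_count` is hypothetical and not part of PAMI, which performs its own conversion internally and may round differently.

```python
import math


def to_support_count(min_sup, num_transactions):
    """Convert a minimum support (count or fraction) to an absolute count.

    Hypothetical helper: integers are treated as absolute counts, and floats
    in [0, 1] as a proportion of the database size, rounded up (an assumption).
    """
    if isinstance(min_sup, bool):
        raise TypeError("min_sup must be an int or a float, not a bool")
    if isinstance(min_sup, int):
        if min_sup < 0:
            raise ValueError("a support count cannot be negative")
        return min_sup
    if isinstance(min_sup, float) and 0.0 <= min_sup <= 1.0:
        return math.ceil(min_sup * num_transactions)
    raise ValueError("min_sup must be a non-negative int or a float in [0, 1]")


# An absolute count passes through unchanged.
print(to_support_count(1000, 100_000))
# A fraction is scaled by the number of transactions.
print(to_support_count(0.25, 10))
```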

#### Step 3: Execute the ECLATDiffset algorithm

In [None]:
obj = alg.ECLATDiffset(iFile=inputFile, minSup=minimumSupportCount, sep=seperator)  # Initialize the miner
obj.startMine()  # Start the mining process

#### Step 4: Storing the generated patterns

##### Step 4.1: Storing the generated patterns in a file

In [None]:
obj.save(outFile='frequentPatternsMinSupCount1000.txt')
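
To reuse the saved patterns later, the file can be read back into a dictionary. The sketch below assumes each line has the form `<items separated by whitespace>:<support>`. That layout is an assumption about the output file, so inspect a few lines of the file before relying on it. The function `read_patterns` and the sample lines are hypothetical.

```python
import io


def read_patterns(lines):
    """Parse lines of the assumed form '<item> <item> ...:<support>'.

    Returns a dict that maps a tuple of items to an integer support.
    """
    patterns = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last colon so the support is always the final field.
        items, support = line.rsplit(":", 1)
        patterns[tuple(items.split())] = int(support)
    return patterns


# Made-up sample content standing in for the saved file.
sample = io.StringIO("a:4\na\tb:3\n\n")
parsed = read_patterns(sample)
print(parsed)
```

With a real file, pass an open file handle instead, for example `with open('frequentPatternsMinSupCount1000.txt') as f: parsed = read_patterns(f)`.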

##### Step 4.2: Storing the generated patterns in a data frame

In [None]:
frequentPatternsDF = obj.getPatternsAsDataFrame()

#### Step 5: Getting the statistics

##### Step 5.1: Total number of discovered patterns

In [None]:
print('Total number of patterns: ' + str(len(frequentPatternsDF)))

##### Step 5.2: Runtime consumed by the mining algorithm

In [None]:
print('Runtime: ' + str(obj.getRuntime()))

##### Step 5.3: Total memory consumed by the mining algorithm

In [None]:
print('Memory (RSS): ' + str(obj.getMemoryRSS()))
print('Memory (USS): ' + str(obj.getMemoryUSS()))
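
The memory figures print as raw numbers, which are hard to read at a glance. The sketch below formats them with a unit suffix, assuming the values returned by `getMemoryRSS()` and `getMemoryUSS()` are byte counts. That is an assumption worth checking against the PAMI documentation. The helper `format_bytes` is hypothetical.

```python
def format_bytes(num_bytes):
    """Render a byte count with a binary unit suffix (B, KiB, MiB, GiB)."""
    value = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        # Stop at the first unit that keeps the value below 1024,
        # or at GiB, the largest unit handled here.
        if value < 1024 or unit == "GiB":
            return f"{value:.2f} {unit}"
        value /= 1024


print(format_bytes(512))
print(format_bytes(1536))
print(format_bytes(3 * 1024 ** 2))
```

In the notebook this would be used as `print('Memory (RSS): ' + format_bytes(obj.getMemoryRSS()))`.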