<a href="https://colab.research.google.com/github/arjunpogaku/BOOK_Hands-on-Pattern-Mining/blob/main/chapter1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Chapter 1: Getting Started with PAMI: Introduction, Maintenance, and Usage**

## Installing the PAMI Package

In [1]:
!pip install PAMI

Collecting PAMI
  Downloading pami-2024.12.6.1-py3-none-any.whl.metadata (80 kB)
Collecting resource (from PAMI)
  Downloading Resource-0.2.1-py2.py3-none-any.whl.metadata (478 bytes)
Collecting validators (from PAMI)
  Downloading validators-0.34.0-py3-none-any.whl.metadata (3.8 kB)
Collecting sphinx-rtd-theme (from PAMI)
  Downloading sphinx_rtd_theme-3.0.2-py2.py3-none-any.whl.metadata (4.4 kB)
Collecting discord.py (from PAMI)
  Downloading discord.py-2.4.0-py3-none-any.whl.metadata (6.9 kB)
Collecting fastparquet (from PAMI)
  Downloading fastparquet-2024.11.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.2 kB)
Collecting cramjam>=2.3 (from fastparquet->PAMI)
  Downloading cramjam-2.9.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.9 kB)
Collecting JsonForm>=0.0.2 (from resource->PAMI)
  Downloading JsonForm-0.0.2

## Download the dataset

In [2]:
!wget -nc https://web-ext.u-aizu.ac.jp/~udayrage/datasets/transactionalDatabases/Transactional_T10I4D100K.csv

--2024-12-06 17:17:16--  https://web-ext.u-aizu.ac.jp/~udayrage/datasets/transactionalDatabases/Transactional_T10I4D100K.csv
Resolving web-ext.u-aizu.ac.jp (web-ext.u-aizu.ac.jp)... 163.143.103.34
Connecting to web-ext.u-aizu.ac.jp (web-ext.u-aizu.ac.jp)|163.143.103.34|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4019277 (3.8M) [text/csv]
Saving to: ‘Transactional_T10I4D100K.csv’


2024-12-06 17:17:17 (5.67 MB/s) - ‘Transactional_T10I4D100K.csv’ saved [4019277/4019277]
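
The downloaded file is a transactional database, and the examples in this chapter read it with `sep='\t'`, which suggests one transaction per line with tab-separated items. The following is a minimal sketch of reading that layout. It uses a small hypothetical in-memory sample rather than the real file, and `parse_transactions` is an illustrative helper, not part of PAMI.

```python
import io

def parse_transactions(handle, sep='\t'):
    """Yield each non-empty line as a list of item strings."""
    for line in handle:
        line = line.strip()
        if line:
            yield line.split(sep)

# Hypothetical sample in the assumed layout:
# one transaction per line, items separated by tabs.
sample = io.StringIO("25\t52\t164\n39\t120\n25\t39\t52\n")
transactions = list(parse_transactions(sample))

print(len(transactions), 'transactions; first:', transactions[0])
```

To inspect the real dataset, the same helper could be given `open('Transactional_T10I4D100K.csv')` instead of the sample.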



## Implementing a pattern mining algorithm

### Syntax



```python
from PAMI.theoreticalModel.patternType import algorithm as alg
# Initialization
obj = alg.algorithm(inputFile, constraints, sep='\t')

# Mining the patterns
obj.mine()

# Save the discovered patterns
obj.save(outputFileName)

# Print the results
print('Total number of patterns: ' + \
       str(len(obj.getPatterns())))  
print('Runtime: ' + str(obj.getRuntime()))  
print('Memory (RSS): ' + str(obj.getMemoryRSS()))
print('Memory (USS): ' + str(obj.getMemoryUSS()))
```
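
The names `theoreticalModel`, `patternType`, and `algorithm` in the template above are placeholders. Judging from the examples in this chapter, they map to `frequentPattern`, `basic`, and `Apriori` (or `FPGrowth`), with the class carrying the same name as its module. A small sketch of that substitution follows; `algorithm_path` and `load_algorithm` are hypothetical helpers written for illustration, not PAMI functions.

```python
import importlib

def algorithm_path(theoretical_model, pattern_type, algorithm):
    """Return the dotted module path that the syntax template stands for."""
    return f"PAMI.{theoretical_model}.{pattern_type}.{algorithm}"

def load_algorithm(theoretical_model, pattern_type, algorithm):
    """Import the module and return the class that shares its name.

    Assumes PAMI is installed and that the class is named after its module,
    as is the case for Apriori and FPGrowth in this chapter.
    """
    module = importlib.import_module(
        algorithm_path(theoretical_model, pattern_type, algorithm))
    return getattr(module, algorithm)

path = algorithm_path('frequentPattern', 'basic', 'Apriori')
print(path)  # PAMI.frequentPattern.basic.Apriori
```

With PAMI installed, `load_algorithm('frequentPattern', 'basic', 'Apriori')` would return the same class that Example 1 reaches through `alg.Apriori`.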



### Example 1: Executing the Apriori algorithm

In [3]:
import PAMI.frequentPattern.basic.Apriori as alg

# Create an Apriori object
obj = alg.Apriori(iFile = 'Transactional_T10I4D100K.csv',
    minSup = 500)
# Run the mining process
obj.mine()
# Save the frequent patterns to an output file
obj.save(oFile = 'patterns.txt')
# Print the results
print('Total number of patterns: ' +
       str(len(obj.getPatterns())))
print('Runtime: ' + str(obj.getRuntime()))
print('Memory (RSS): ' + str(obj.getMemoryRSS()))
print('Memory (USS): ' + str(obj.getMemoryUSS()))

Frequent patterns were generated successfully using Apriori algorithm 
Total number of patterns: 1072
Runtime: 6.481090784072876
Memory (RSS): 372334592
Memory (USS): 349745152
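
In Example 1, `minSup = 500` asks for every itemset that occurs in at least 500 transactions. The level-wise search behind Apriori can be illustrated on a toy database. The sketch below is pure Python for intuition only: it is not PAMI's implementation, and `frequent_itemsets` and the toy data are hypothetical.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_sup):
    """Level-wise search: keep itemsets found in at least min_sup transactions."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every distinct item.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count the support of each candidate by scanning the database.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_sup}
        frequent.update(level)
        k += 1
        # Join frequent (k-1)-itemsets into k-item candidates, then prune any
        # candidate that has an infrequent (k-1)-subset.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {c for c in joined
                      if all(frozenset(s) in level
                             for s in combinations(c, k - 1))}
    return frequent

toy = [['a', 'b', 'c'], ['a', 'b'], ['a', 'c'], ['b', 'c'], ['a', 'b', 'c']]
result = frequent_itemsets(toy, min_sup=3)

for itemset, support in sorted(result.items(),
                               key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

On this toy database every single item and every pair reaches the threshold of 3, while `{a, b, c}` occurs in only two transactions and is dropped.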


## Evaluating Multiple Pattern Mining Algorithms

### Syntax

```python
from PAMI.frequentPattern.basic import Apriori as alg1
from PAMI.frequentPattern.basic import FPGrowth  as alg2
import pandas as pd

minimumSupportCountList = [1000, 1500, 2000, 2500, 3000]

resultDF = pd.DataFrame(columns=['algorithm', 'minSup','patterns', 'runtime', 'memoryRSS', 'memoryUSS'])


for minSupCount in minimumSupportCountList:
    obj = alg1.Apriori(iFile='Transactional_T10I4D100K.csv', minSup=minSupCount,sep='\t')
    obj.mine()
    resultDF.loc[resultDF.shape[0]]=['Apriori', minSupCount,len(obj.getPatterns()), obj.getRuntime(), obj.getMemoryRSS(), obj.getMemoryUSS()]

for minSupCount in minimumSupportCountList:
    obj = alg2.FPGrowth(iFile='Transactional_T10I4D100K.csv', minSup=minSupCount, sep='\t')
    obj.mine()
    resultDF.loc[resultDF.shape[0]]=['FPgrowth', minSupCount,len(obj.getPatterns()), obj.getRuntime(),obj.getMemoryRSS(), obj.getMemoryUSS()]

resultDF #print dataframe
```



### Example 2: Evaluating Apriori and FP-growth algorithms

In [None]:
from PAMI.frequentPattern.basic import Apriori as alg1
from PAMI.frequentPattern.basic import FPGrowth  as alg2
import pandas as pd

minimumSupportCountList = [1000, 1500, 2000, 2500, 3000]

resultDF = pd.DataFrame(columns=['algorithm', 'minSup','patterns', 'runtime', 'memoryRSS', 'memoryUSS'])


for minSupCount in minimumSupportCountList:
    obj = alg1.Apriori(iFile='Transactional_T10I4D100K.csv', minSup=minSupCount,sep='\t')
    obj.mine()
    resultDF.loc[resultDF.shape[0]]=['Apriori', minSupCount,len(obj.getPatterns()), obj.getRuntime(), obj.getMemoryRSS(), obj.getMemoryUSS()]

for minSupCount in minimumSupportCountList:
    obj = alg2.FPGrowth(iFile='Transactional_T10I4D100K.csv', minSup=minSupCount, sep='\t')
    obj.mine()
    resultDF.loc[resultDF.shape[0]]=['FPgrowth', minSupCount,len(obj.getPatterns()), obj.getRuntime(),obj.getMemoryRSS(), obj.getMemoryUSS()]

resultDF #print dataframe

Frequent patterns were generated successfully using Apriori algorithm 
Frequent patterns were generated successfully using Apriori algorithm 
Frequent patterns were generated successfully using Apriori algorithm 
Frequent patterns were generated successfully using Apriori algorithm 
Frequent patterns were generated successfully using Apriori algorithm 
Frequent patterns were generated successfully using frequentPatternGrowth algorithm
Frequent patterns were generated successfully using frequentPatternGrowth algorithm
Frequent patterns were generated successfully using frequentPatternGrowth algorithm
Frequent patterns were generated successfully using frequentPatternGrowth algorithm
Frequent patterns were generated successfully using frequentPatternGrowth algorithm


|   | algorithm | minSup | patterns | runtime | memoryRSS | memoryUSS |
|---|-----------|--------|----------|---------|-----------|-----------|
| 0 | Apriori | 1000 | 385 | 6.054968 | 412635136 | 390168576 |
| 1 | Apriori | 1500 | 237 | 4.813618 | 380657664 | 358240256 |
| 2 | Apriori | 2000 | 155 | 2.514425 | 365977600 | 343453696 |
| 3 | Apriori | 2500 | 107 | 2.057492 | 357425152 | 335097856 |
| 4 | Apriori | 3000 | 60 | 1.414784 | 378249216 | 355840000 |
| 5 | FPgrowth | 1000 | 385 | 8.031212 | 497770496 | 475054080 |
| 6 | FPgrowth | 1500 | 237 | 5.059398 | 441667584 | 419213312 |
| 7 | FPgrowth | 2000 | 155 | 5.088298 | 420995072 | 398401536 |
| 8 | FPgrowth | 2500 | 107 | 3.357919 | 412327936 | 389787648 |
| 9 | FPgrowth | 3000 | 60 | 1.755309 | 404918272 | 382201856 |
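
A quick sanity check on results like these is to confirm that both algorithms report the same number of patterns at each `minSup`, and then to compare their cost. The sketch below does this with the values copied from the results above, using only the standard library. It assumes that `runtime` is in seconds and `memoryRSS` is in bytes; the `summary` structure is illustrative.

```python
# Rows copied from the results above:
# (algorithm, minSup, patterns, runtime in seconds, memoryRSS in bytes)
rows = [
    ('Apriori', 1000, 385, 6.054968, 412635136),
    ('Apriori', 1500, 237, 4.813618, 380657664),
    ('Apriori', 2000, 155, 2.514425, 365977600),
    ('Apriori', 2500, 107, 2.057492, 357425152),
    ('Apriori', 3000, 60, 1.414784, 378249216),
    ('FPgrowth', 1000, 385, 8.031212, 497770496),
    ('FPgrowth', 1500, 237, 5.059398, 441667584),
    ('FPgrowth', 2000, 155, 5.088298, 420995072),
    ('FPgrowth', 2500, 107, 3.357919, 412327936),
    ('FPgrowth', 3000, 60, 1.755309, 404918272),
]

# Index the measurements by (algorithm, minSup) for side-by-side lookup.
by_key = {(alg, min_sup): (patterns, runtime, rss)
          for alg, min_sup, patterns, runtime, rss in rows}

summary = []
for min_sup in sorted({m for _, m in by_key}):
    ap = by_key[('Apriori', min_sup)]
    fp = by_key[('FPgrowth', min_sup)]
    summary.append({
        'minSup': min_sup,
        'same_patterns': ap[0] == fp[0],       # both should find the same patterns
        'runtime_ratio': fp[1] / ap[1],        # FP-growth runtime relative to Apriori
        'apriori_rss_mib': ap[2] / 2**20,      # bytes -> MiB
        'fpgrowth_rss_mib': fp[2] / 2**20,
    })

for s in summary:
    print(f"minSup={s['minSup']}: same pattern count={s['same_patterns']}, "
          f"FPgrowth/Apriori runtime={s['runtime_ratio']:.2f}, "
          f"RSS {s['apriori_rss_mib']:.1f} vs {s['fpgrowth_rss_mib']:.1f} MiB")
```

These figures come from a single run, so the ratios describe this particular measurement rather than the algorithms in general.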


## Plotting the results

### Syntax



```python
from PAMI.extras.graph import plotLineGraphsFromDataFrame as dif
# Pass the result data frame to the class
obj = dif.plotLineGraphsFromDataFrame(resultDF)
# Plot the graphs
obj.plot(result=resultDF, xaxis='constraint', yaxis='patterns', label='algorithm')
obj.plot(result=resultDF, xaxis='constraint', yaxis='runtime', label='algorithm')
obj.plot(result=resultDF, xaxis='constraint', yaxis='memoryRSS', label='algorithm')
obj.plot(result=resultDF, xaxis='constraint', yaxis='memoryUSS', label='algorithm')
# Save the graphs
obj.save(result=resultDF, xaxis='constraint', yaxis='patterns', label='algorithm', oFile='patterns.jpg')
obj.save(result=resultDF, xaxis='constraint', yaxis='runtime', label='algorithm', oFile='runtime.jpg')
obj.save(result=resultDF, xaxis='constraint', yaxis='memoryRSS', label='algorithm', oFile='memoryRSS.jpg')
obj.save(result=resultDF, xaxis='constraint', yaxis='memoryUSS', label='algorithm', oFile='memoryUSS.jpg')
```



### Example 3: Displaying and Saving the Evaluation Results

In [14]:
from PAMI.extras.graph import plotLineGraphsFromDataFrame as dif
# Pass the result data frame to the class
obj = dif.plotLineGraphsFromDataFrame(resultDF)
# Draw the graphs
obj.plot(result=resultDF, xaxis='minSup', yaxis='patterns', label='algorithm')
obj.plot(result=resultDF, xaxis='minSup', yaxis='runtime', label='algorithm')
obj.plot(result=resultDF, xaxis='minSup', yaxis='memoryRSS', label='algorithm')
obj.plot(result=resultDF, xaxis='minSup', yaxis='memoryUSS', label='algorithm')
# Save the graphs
obj.save(result=resultDF, xaxis='minSup', yaxis='patterns', label='algorithm', oFile='patterns.jpg')
obj.save(result=resultDF, xaxis='minSup', yaxis='runtime', label='algorithm', oFile='runtime.jpg')
obj.save(result=resultDF, xaxis='minSup', yaxis='memoryRSS', label='algorithm', oFile='memoryRSS.jpg')
obj.save(result=resultDF, xaxis='minSup', yaxis='memoryUSS', label='algorithm', oFile='memoryUSS.jpg')

Note: the plotter must be called through the variable it was assigned to. If it is assigned to a different name (for example `ab`) while `plot` is called on `obj`, then `obj` still refers to the Apriori object created earlier and Python raises `AttributeError: 'Apriori' object has no attribute 'plot'`.

## Creating the LaTeX file containing the results

### Syntax



```python
from PAMI.extras.graph import results2Latex as alg
# Initialize
obj = alg.results2Latex()

# Print the LaTeX code
obj.print(result=resultDF, xaxis='xLabel', yaxis='yLabel',
    label='algorithm')
# Save the LaTeX code in a file
obj.save(result=resultDF, xaxis='xLabel', yaxis='yLabel',
    label='algorithm', oFile='outputFileName.txt')
```



### Example 4: Storing Comparison Results in LaTeX Files

In [15]:
from PAMI.extras.graph import results2Latex as res

obj = res.results2Latex()
# Print the LaTeX code on the terminal
obj.print(result=resultDF, xaxis='minSup', yaxis='patterns',label='algorithm')
obj.print(result=resultDF, xaxis='minSup', yaxis='runtime', label='algorithm')
obj.print(result=resultDF, xaxis='minSup', yaxis='memoryRSS',label='algorithm')
obj.print(result=resultDF, xaxis='minSup', yaxis='memoryUSS', label='algorithm')
# Save the LaTeX code in a file
obj.save(result=resultDF, xaxis='minSup', yaxis='patterns', label='algorithm', oFile='patterns.txt')
obj.save(result=resultDF, xaxis='minSup', yaxis='runtime', label='algorithm', oFile='runtime.txt')
obj.save(result=resultDF, xaxis='minSup', yaxis='memoryRSS', label='algorithm', oFile='memoryRSS.txt')
obj.save(result=resultDF, xaxis='minSup', yaxis='memoryUSS', label='algorithm', oFile='memoryUSS.txt')

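ValueError: The input DataFrame is empty. Please provide a DataFrame with data.

This error means that `resultDF` contained no rows when the cell ran. Run Example 2 first so that `resultDF` holds the evaluation results.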