The ECLAT (Equivalence Class Clustering and bottom-up Lattice Traversal) algorithm identifies frequent item sets in transactional data. Here is pseudocode for the algorithm:

Initialize an empty item set (L)

For each transaction in the transactional data:
  a. Get the items in the transaction and sort them in lexicographical order
  b. Add the items to the item set (L)

Initialize a dictionary (D) to store the frequent items and their support count

For each item in the item set (L):
  a. Count the number of transactions that contain the item
  b. If the support count of the item is greater than or equal to the minimum support threshold, add the item to the dictionary (D)

For each item in the dictionary (D), find its frequent item sets by:
  a. Generating candidate combinations that extend the item with other frequent items
  b. Counting the number of transactions that contain each combination
  c. Adding the combination to the dictionary (D) if its support count is greater than or equal to the minimum support threshold

Return the frequent item sets in the dictionary (D)
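
ECLAT is usually described in terms of a vertical data layout: each item maps to the set of IDs of the transactions that contain it (its tidset), and the support of a combination is the size of the intersection of its items' tidsets. The counting steps above can be sketched that way as follows. This is a minimal illustration under those assumptions, not the implementation used in the cells below; the names `eclat_tidsets`, `extend`, and `sample` are hypothetical, and the sample data reuses the small commented-out example dataset from this notebook.

```python
def eclat_tidsets(transactions, min_support):
    """Return {frozenset(itemset): support} using tidset intersections."""
    # Vertical layout: item -> set of IDs of the transactions containing it
    tidsets = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            tidsets.setdefault(item, set()).add(tid)

    # Keep frequent single items in a fixed order so each combination is built once
    frequent = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )

    result = {}

    def extend(prefix, prefix_tids, candidates):
        # Record the current item set, then try to grow it with later items
        result[frozenset(prefix)] = len(prefix_tids)
        for i, (item, tids) in enumerate(candidates):
            new_tids = prefix_tids & tids  # transactions containing prefix + item
            if len(new_tids) >= min_support:
                extend(prefix + [item], new_tids, candidates[i + 1:])

    for i, (item, tids) in enumerate(frequent):
        extend([item], tids, frequent[i + 1:])
    return result


# Hypothetical sample data (the small example dataset, written as sets)
sample = [{1, 2, 3}, {1, 2, 4}, {2, 3, 4}, {2, 3}, {1, 2}, {1, 3}, {2, 4}]
frequent_sets = eclat_tidsets(sample, min_support=3)
for itemset, support in sorted(
    frequent_sets.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))
):
    print("Item Set: {} Support: {}".format(sorted(itemset), support))
```

Because each recursive call only intersects tidsets that are already known to be frequent, the dataset itself is scanned once, when the vertical layout is built.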

In [None]:
def read_dataset(file_path):
    """
    This function reads the dataset from a text file. 
    Each line in the file represents a transaction, and the numbers in each line are separated by a space.
    The function returns a list of transactions where each transaction is represented as a set.

    Parameters:
    file_path (str): The path to the text file containing the dataset.

    Returns:
    list: A list of transactions, each represented as a set.
    """
    transactions = []
    with open(file_path, 'r') as f:
        for line in f:
            # Skip blank lines so they do not become empty transactions
            if not line.strip():
                continue
            transaction = set(map(int, line.split()))
            transactions.append(transaction)
    return transactions
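
As a quick illustration of the input format described in the docstring, the sketch below parses the same space-separated layout from an in-memory string instead of a file. The helper name `parse_transactions` and the sample text are assumptions made for this example; the per-line parsing mirrors what `read_dataset` does.

```python
import io


def parse_transactions(lines):
    """Parse an iterable of space-separated integer lines into a list of sets."""
    return [set(map(int, line.split())) for line in lines]


# Hypothetical file contents: one transaction per line, items separated by spaces
sample_text = "1 2 3\n1 2 4\n2 3\n"
parsed = parse_transactions(io.StringIO(sample_text))
print(parsed)
```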

In [None]:
from collections import Counter


def eclat(dataset, min_support=1):
    """
    Find all frequent itemsets in the dataset with a depth-first search.

    Parameters:
    dataset (list): A list of transactions, each an iterable of items.
    min_support (int): The minimum number of transactions an itemset must appear in.

    Returns:
    list: A list of (itemset, support) tuples, where each itemset is a list of items.
    """
    # Count, for each item, the number of transactions that contain it
    item_counts = Counter(item for transaction in dataset for item in set(transaction))

    # Keep only the frequent items, sorted by descending frequency
    item_list = [item for item, count in item_counts.most_common() if count >= min_support]

    # Get frequent itemsets
    result = []
    for i in range(len(item_list)):
        eclat_recursive(dataset, [item_list[i]], i, item_list, min_support, result)

    # Return frequent itemsets
    return result


def eclat_recursive(dataset, prefix, prefix_index, item_list, min_support, result):
    """Record the prefix if it is frequent, then try to extend it with each later item."""
    # Find prefix support: the number of transactions containing every item in the prefix
    prefix_support = sum(1 for transaction in dataset if all(item in transaction for item in prefix))

    # An infrequent prefix cannot have frequent supersets, so stop here
    if prefix_support < min_support:
        return

    # Add prefix to result
    result.append((prefix, prefix_support))

    # Extend the prefix with each item that comes after it in the item list
    for i in range(prefix_index + 1, len(item_list)):
        eclat_recursive(dataset, prefix + [item_list[i]], i, item_list, min_support, result)

# Example usage
# dataset = [
#     [1, 2, 3],
#     [1, 2, 4],
#     [2, 3, 4],
#     [2, 3],
#     [1, 2],
#     [1, 3],
#     [2, 4]
# ]



In [None]:
transactions = read_dataset('dataset.txt')

min_support = 4

# frequent_itemsets = eclat(dataset, min_support)
frequent_itemsets = eclat(transactions, min_support)

for frequent_itemset, support in frequent_itemsets:
    print("Item Set: {} Support: {}".format(frequent_itemset, support))


Streaming output truncated to the last 5000 lines.
Item Set: [19, 21, 74] Support: 4
Item Set: [19, 21, 120] Support: 4
Item Set: [19, 21, 53] Support: 4
Item Set: [19, 21, 97] Support: 4
Item Set: [19, 42] Support: 7
Item Set: [19, 62] Support: 18
Item Set: [19, 62, 112] Support: 4
Item Set: [19, 62, 120] Support: 6
Item Set: [19, 62, 6] Support: 4
Item Set: [19, 62, 88] Support: 5
Item Set: [19, 62, 118] Support: 4
Item Set: [19, 62, 22] Support: 6
Item Set: [19, 62, 17] Support: 4
Item Set: [19, 62, 7] Support: 5
Item Set: [19, 62, 93] Support: 4
Item Set: [19, 62, 76] Support: 5
Item Set: [19, 62, 35] Support: 4
Item Set: [19, 62, 109] Support: 4
Item Set: [19, 62, 111] Support: 4
Item Set: [19, 62, 20] Support: 4
Item Set: [19, 72] Support: 9
Item Set: [19, 74] Support: 17
Item Set: [19, 74, 97] Support: 4
Item Set: [19, 74, 2] Support: 5
Item Set: [19, 74, 5] Support: 4
Item Set: [19, 74, 29] Support: 4
Item Set: [19, 74, 59] Support: 4
Item Set: [19, 74, 93] Suppor