A simple Python implementation of the PrefixSpan data mining algorithm

Holy-Shine/PrefixSpan-py

PrefixSpan-py

A Python implementation of the PrefixSpan data mining algorithm.

README (Chinese version)

Based on the paper:

PrefixSpan: Mining Sequential Patterns Efficiently by Prefix-Projected Pattern Growth

Quick Start

This script has no third-party dependencies. It only requires Python 3.

  1. Import the script.

    import PrefixSpan
  2. Prepare the data. It must be a 2-D Python list (a list of sequences) whose elements are single characters or ints.

    data = [
        [1, 4, 2, 3],
        [0, 1, 2, 3],
        [1, 2, 1, 3, 4],
        [2, 1, 2, 4, 4],
        [1, 1, 1, 2, 3],
    ]
  3. Call the PrefixSpan function.

    L = PrefixSpan.PrefixSpan(data, minSup=0.8, minConf=0)

    The function returns a list of the frequent subsequences it found, each represented as a Python tuple.

    Parameters:

    • minSup: minimum support threshold (min_support in the paper), default 0.5
    • minConf: minimum confidence threshold, default 0
  4. Print the result.

    print(L)

    [out]:

    [[(1,), (2,), (3,)], [(1, 2), (1, 3), (2, 3)], [(1, 2, 3)], []]

    The result is grouped by pattern length: each inner list holds patterns one item longer than those in the list before it.
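To see why these patterns are reported, the support of each one can be recomputed by hand. The sketch below is not part of this repository: `is_subsequence` and `support` are hypothetical helpers written for illustration. It assumes that minSup is the fraction of input sequences that contain a pattern as an ordered subsequence (gaps allowed), which is consistent with the example output above but is not stated explicitly in this README.

```python
def is_subsequence(pattern, sequence):
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` advances the iterator, so matches must appear left to right.
    return all(item in it for item in pattern)


def support(pattern, data):
    """Fraction of sequences in `data` that contain `pattern` (assumed definition)."""
    return sum(is_subsequence(pattern, seq) for seq in data) / len(data)


data = [
    [1, 4, 2, 3],
    [0, 1, 2, 3],
    [1, 2, 1, 3, 4],
    [2, 1, 2, 4, 4],
    [1, 1, 1, 2, 3],
]

# Result copied from the README output above.
result = [[(1,), (2,), (3,)], [(1, 2), (1, 3), (2, 3)], [(1, 2, 3)], []]

# The inner list at position i (1-based) holds the patterns of length i.
for length, patterns in enumerate(result, start=1):
    for pattern in patterns:
        print(length, pattern, support(pattern, data))
```

Under that assumption, every listed pattern has a support of at least 0.8. The item 4 appears in only three of the five sequences (support 0.6), so no pattern containing it is reported.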
