# Type 1 (3-way matching, invoice after goods receipt): Filtering

This note filters out INCOMPLETE TRACES. A trace is treated as incomplete if it meets any of the criteria below.  

#### *  Required activities are missing
#### *  The start or end activity is not a proper one
#### *  The case duration is 0
#### *  The accumulated value is 0

<img src='1.jpg'>
https://images.app.goo.gl/CFJoBFtvU3wruhUQ6

In [None]:
from IPython.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))

# IMPORTING

In [1]:
from pm4py.objects.log.importer.xes import importer as xes_importer
log = xes_importer.apply('./data/3way_after.xes')

In [2]:
print(
    "Number of trace", len(log),
    "\nNumber of events", sum([len(trace) for trace in log])
)

Number of trace 4455 
Number of events 112146


# FILTERING 

## 1. Filtering by variants

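A variant is kept if it matches either of the two cases below.

<b> CASE I </b> (regular completion): the variant contains all of
* Create Purchase Order Item  
* Record Goods Receipt  
* Clear Invoice  


<b> CASE II </b> (cancellation): the variant contains
* Create Purchase Order Item  
* Delete Purchase Order Item, as the last activity  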
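
The two cases can be read as a single predicate over a variant. The sketch below is a minimal illustration that does not use pm4py: it assumes a variant is a comma-joined string of activity names and that activity names contain no commas. The helper name `is_complete_variant` is hypothetical, and unlike a substring search it matches whole activity names.

```python
# Hypothetical helper; assumes a variant is a comma-joined string of activity names.
PO = "Create Purchase Order Item"
GOODS_RECEIPT = "Record Goods Receipt"
INVOICE = "Clear Invoice"
DELETE = "Delete Purchase Order Item"


def is_complete_variant(variant: str) -> bool:
    """Return True if the variant matches Case I or Case II."""
    activities = variant.split(",")
    # Case I: PO creation, goods receipt, and invoice clearing all occur.
    case_1 = all(a in activities for a in (PO, GOODS_RECEIPT, INVOICE))
    # Case II: a PO is created and the trace ends with its deletion.
    case_2 = PO in activities and activities[-1] == DELETE
    return case_1 or case_2


# Toy variants, made up for illustration only.
examples = [
    "Create Purchase Order Item,Record Goods Receipt,Clear Invoice",
    "Create Purchase Order Item,Delete Purchase Order Item",
    "Create Purchase Order Item,Record Goods Receipt",
]
kept = [v for v in examples if is_complete_variant(v)]
print(len(kept))
```

The third toy variant is dropped because it has neither a cleared invoice nor a final deletion.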


In [6]:
from pm4py.algo.filtering.log.variants import variants_filter
variants = variants_filter.get_variants(log)

In [7]:
from pm4py.statistics.traces.log import case_statistics
variants_count = case_statistics.get_variant_statistics(log)
variants_count = sorted(variants_count, key=lambda x: x['count'], reverse=True)

In [8]:
# Collect the variants that count as complete (Case I or Case II).
PO = 'Create Purchase Order Item'
INVOICE = 'Clear Invoice'
GOODS_RECEIPT = 'Record Goods Receipt'
DELETE = 'Delete Purchase Order Item'

search = []
for entry in variants_count:
    variant = entry['variant']  # comma-separated activity names
    # Case I: PO creation, goods receipt, and invoice clearing all occur
    case_1 = PO in variant and GOODS_RECEIPT in variant and INVOICE in variant
    # Case II: a PO exists and the last activity is its deletion (== cancellation)
    case_2 = PO in variant and variant.endswith(DELETE)
    if case_1 or case_2:
        search.append(variant)

In [9]:
log = variants_filter.apply(log, search)
print("The number of filtered traces : ", len(log))

The number of filtered traces :  3259


## 2. Filtering by Start and End activities 

#### Traces with the following start or end activities are not regarded as COMPLETE  

* Traces starting with 'Vendor creates invoice'  
* Traces ending with 'Remove Payment Block'  
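
As a rough picture of what this step checks, the sketch below applies the same rule to plain lists of activity names instead of pm4py trace objects. The allowed start and end activities are the ones this notebook keeps; the helper name `has_valid_boundaries` and the toy traces are assumptions made for illustration.

```python
# Illustrative only: traces are plain lists of activity names, not pm4py objects.
ALLOWED_STARTS = {"Create Purchase Order Item", "Create Purchase Requisition Item"}
ALLOWED_ENDS = {"Clear Invoice", "Delete Purchase Order Item"}


def has_valid_boundaries(trace):
    """Keep a trace only if both its first and last activity are allowed."""
    return bool(trace) and trace[0] in ALLOWED_STARTS and trace[-1] in ALLOWED_ENDS


# Toy traces, made up for illustration only.
toy_traces = [
    ["Create Purchase Order Item", "Record Goods Receipt", "Clear Invoice"],
    ["Vendor creates invoice", "Create Purchase Order Item", "Clear Invoice"],
    ["Create Purchase Order Item", "Clear Invoice", "Remove Payment Block"],
]
kept = [t for t in toy_traces if has_valid_boundaries(t)]
print(len(kept))
```

Only the first toy trace survives: the second starts with 'Vendor creates invoice' and the third ends with 'Remove Payment Block'.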

In [10]:
from pm4py.algo.filtering.log.start_activities import start_activities_filter
from pm4py.algo.filtering.log.end_activities import end_activities_filter

log_start = start_activities_filter.get_start_activities(log)
log_end = end_activities_filter.get_end_activities(log)

In [11]:
log_start

{'Create Purchase Order Item': 3124,
 'Create Purchase Requisition Item': 49,
 'Vendor creates invoice': 86}

In [12]:
log_end

{'Clear Invoice': 3129,
 'Delete Purchase Order Item': 129,
 'Remove Payment Block': 1}

In [13]:
log = start_activities_filter.apply(log, ['Create Purchase Order Item', 'Create Purchase Requisition Item'])
log = end_activities_filter.apply(log, ['Clear Invoice', 'Delete Purchase Order Item'])
print("The number of filtered traces : ", len(log))

The number of filtered traces :  3172


## 3. Filtering by case performance (duration = 0)
Traces whose duration is 0 are logically wrong, so only cases that last at least 1 second are kept. 
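
To make the rule concrete, here is a small sketch without pm4py. It assumes the duration of a case is the time between its first and last event, measured in seconds; the function names and the toy timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def case_duration_seconds(timestamps):
    """Duration of a case: last event time minus first event time, in seconds."""
    return (max(timestamps) - min(timestamps)).total_seconds()


def keep_case(timestamps, min_seconds=1.0):
    """Keep a case only if it lasts at least `min_seconds`."""
    return case_duration_seconds(timestamps) >= min_seconds


# Toy cases, made up for illustration only.
start = datetime(2018, 1, 1, 9, 0, 0)
zero_duration_case = [start, start, start]
normal_case = [start, start + timedelta(days=3), start + timedelta(days=10)]
print(keep_case(zero_duration_case), keep_case(normal_case))
```

A case in which every event carries the same timestamp has a duration of 0 and is dropped.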

In [14]:
from pm4py.algo.filtering.log.cases import case_filter
log = case_filter.filter_on_case_performance(log, 1, float('inf'))

In [15]:
print("The number of filtered traces : ", len(log))

The number of filtered traces :  3172


## 4. Filtering by Cumulative_net_worth_(EUR) = 0

Cases whose cumulative net worth is 0 are not valid, especially those that require 'Clear Invoice'. 
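
The idea can be sketched on plain dictionaries, without pm4py. The sketch assumes that a trace is dropped as soon as any of its events carries a value of 0 for the attribute; the helper name `has_zero_value` and the toy events are hypothetical.

```python
ATTRIBUTE = "Cumulative_net_worth_(EUR)"


def has_zero_value(trace, attribute=ATTRIBUTE):
    """True if any event in the trace carries a value of 0 for the attribute."""
    return any(
        attribute in event and float(event[attribute]) == 0.0 for event in trace
    )


# Toy log, made up for illustration only: each trace is a list of event dicts.
toy_log = [
    [{"concept:name": "Create Purchase Order Item", ATTRIBUTE: 250.0},
     {"concept:name": "Clear Invoice", ATTRIBUTE: 250.0}],
    [{"concept:name": "Create Purchase Order Item", ATTRIBUTE: 0.0},
     {"concept:name": "Clear Invoice", ATTRIBUTE: 0.0}],
]
kept = [trace for trace in toy_log if not has_zero_value(trace)]
print(len(kept))
```

Converting with `float` lets the check work whether the value is stored as a number or as a string such as "0.0".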

In [16]:
from pm4py.util import constants
from pm4py.algo.filtering.log.attributes import attributes_filter
log = attributes_filter.apply(
    log, ["0.0"],
    parameters = {constants.PARAMETER_CONSTANT_ATTRIBUTE_KEY: "Cumulative_net_worth_(EUR)", "positive": False}
)

In [17]:
print("The number of filtered traces : ", len(log))

The number of filtered traces :  3170
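
The trace counts printed after each step can be turned into a short summary of how many traces every filter removed. The sketch below only reuses the counts reported in this notebook; the step labels are informal.

```python
# Trace counts as printed in this notebook after each step.
steps = [
    ("imported log", 4455),
    ("1. variants", 3259),
    ("2. start and end activities", 3172),
    ("3. case duration", 3172),
    ("4. cumulative net worth", 3170),
]

# Traces removed by each filter: previous count minus current count.
removed = {
    name: previous - count
    for (_, previous), (name, count) in zip(steps, steps[1:])
}
total_removed = steps[0][1] - steps[-1][1]

for name, n in removed.items():
    print(f"{name}: removed {n}")
print(f"total removed: {total_removed} of {steps[0][1]}")
```

Most of the reduction comes from the variant filter; the duration filter removes nothing from this log.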


# EXPORTING THE FILTERED LOG 

In [18]:
# Use the exporter module (same naming as the importer above) instead of the deprecated factory
from pm4py.objects.log.exporter.xes import exporter as xes_exporter
xes_exporter.apply(log, "filtered_3way_after.xes")

  
