# Processing Call Records for Insights

The goal is to develop a robust Python script that processes and analyzes telephony data stored in one or more CSV files within a specified folder. The script should perform the following tasks:

#### Data Ingestion:

Read one or multiple CSV files from a given folder path.

Handle cases where the folder contains no CSV files or only one CSV file.

If multiple CSV files are present, attempt to concatenate them into a single Pandas DataFrame, but only if all files have the same column structure.

Implement error handling to gracefully skip files with parsing errors and provide informative warnings.
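
The ingestion rules above can be sketched as follows. This is a minimal illustration, not the implementation used later in this notebook: the helper name `load_csv_folder` is hypothetical, and catching `EmptyDataError` alongside `ParserError` is an assumption about which read failures should count as skippable.

```python
import glob
import os

import pandas as pd


def load_csv_folder(folder_path):
    """Hypothetical helper: combine all CSV files in folder_path, or return None."""
    frames = []
    for path in sorted(glob.glob(os.path.join(folder_path, "*.csv"))):
        try:
            frames.append(pd.read_csv(path))
        except (pd.errors.ParserError, pd.errors.EmptyDataError) as exc:
            # Skip unreadable files but report which one failed and why
            print(f"Warning: skipping '{os.path.basename(path)}': {exc}")

    if not frames:
        print("No readable CSV files found.")
        return None

    # Concatenate only when every file has the same columns in the same order
    first_columns = list(frames[0].columns)
    if any(list(frame.columns) != first_columns for frame in frames[1:]):
        print("Warning: CSV files have different columns. Skipping concatenation.")
        return None

    return pd.concat(frames, ignore_index=True)
```

A folder with a single CSV file needs no special case here: the column check passes trivially and `pd.concat` returns that one frame with a fresh index.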

#### Data Transformation and Cleaning (for specific call details):

For specific call-related columns ("Agent Call Duration", "Agent Call Pulses", "Agent Call Cost (INR)", "Call Recording Duration", "Call Recording Pulses", "Call Recording Cost (INR)", "Call Duration", "Call Pulses", "Call Cost (INR)"), which may contain multiple values separated by pipe ("|"):

   > Identify and Exclude Rows with Unequal Pipes: Identify and remove rows where the numbers of pipe-separated values in the corresponding duration, pulses, and cost columns are not equal for a given call type (Agent, Recording, Call). Log the 'Unicon UUID' of these skipped rows.
    
   > Split Pipe-Separated Values: Split the string values in the duration, pulses, and cost columns into individual numerical values.
   
   > Reshape Data: Transform the DataFrame such that each individual duration, pulse, and cost value from the split columns becomes a separate row, while retaining the corresponding 'Unicon UUID'.
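
One possible way to express these three steps with vectorized pandas calls is sketched below. It is an alternative to a row-by-row loop, not the approach taken in the code cell of this notebook, and it assumes pandas 1.3 or newer because `DataFrame.explode` accepts a list of columns only from that version. The sample values are invented for illustration.

```python
import pandas as pd

# Illustrative data only; the column names follow the Agent columns listed above
df = pd.DataFrame({
    "Unicon UUID": ["uuid1", "uuid2", "uuid3"],
    "Agent Call Duration": ["10|20", "30", "40|50"],
    "Agent Call Pulses": ["5|10", "15", "20"],  # uuid3 has one value too few
    "Agent Call Cost (INR)": ["1|2", "3", "4|5"],
})
value_cols = ["Agent Call Duration", "Agent Call Pulses", "Agent Call Cost (INR)"]

# Count the pipes per cell; a row is consistent when all three counts agree
pipe_counts = df[value_cols].apply(lambda col: col.str.count(r"\|"))
consistent = pipe_counts.nunique(axis=1) == 1
skipped_uuids = df.loc[~consistent, "Unicon UUID"].tolist()

# Split each cell into a list, then explode the three columns together
kept = df.loc[consistent].copy()
for col in value_cols:
    kept[col] = kept[col].str.split("|")
long_df = kept.explode(value_cols, ignore_index=True)

# Convert to numbers; placeholders such as "-" would become NaN here, not 0
long_df[value_cols] = long_df[value_cols].apply(pd.to_numeric, errors="coerce")

print(skipped_uuids)
print(long_df)
```

Note the design difference in the last step: coercing to NaN keeps unparseable entries distinguishable from genuine zeros, whereas mapping them to 0 makes them indistinguishable in the later zero-duration and zero-pulse filters.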

#### Data Filtering (for split call details):

For each of the split call detail DataFrames (Agent, Recording, Call), filter the data to:

   > Identify records where the duration is greater than zero AND the pulses are zero.
   
   > Identify records where the duration is zero AND the pulses are greater than zero.
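
Both conditions are plain boolean masks. The sketch below shows them on invented data using the Call column names; the variable names are illustrative only.

```python
import pandas as pd

# Illustrative split data: one row per individual pipe-separated value
split_df = pd.DataFrame({
    "Unicon UUID": ["uuid1", "uuid1", "uuid2", "uuid3"],
    "Call Duration": [15.0, 0.0, 30.0, 0.0],
    "Call Pulses": [7.0, 1.0, 0.0, 0.0],
})

duration = split_df["Call Duration"]
pulses = split_df["Call Pulses"]

# Duration recorded but no pulses counted
duration_without_pulses = split_df[(duration > 0) & (pulses == 0)]
# Pulses counted although the duration is zero
pulses_without_duration = split_df[(duration == 0) & (pulses > 0)]

print(duration_without_pulses)
print(pulses_without_duration)
```

The parentheses around each comparison are required because `&` binds more tightly than `>` and `==` in Python.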

#### Data Aggregation:

   For the original DataFrame (before splitting), calculate the sum of specific numerical columns related to costs, counts, and units (e.g., "SMSs Sent", "SMSs Cost (INR)", "Emails Cost (INR)", "STT Cost (INR)", "TTS Cost (INR)", "Total Additional Services Cost (INR)", "Total Cost (INR)"). Handle potential non-numeric values by coercing them to numeric (resulting in NaN) and then summing, while providing warnings for columns that cannot be converted.
   
   For each of the split call detail DataFrames (Agent, Recording, Call), calculate the total duration, total pulses, and total cost.
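
The coerce-then-sum rule can be illustrated with a short sketch. The data is invented, and `"n/a"` stands in for any non-numeric entry; the point is that `Series.sum` skips NaN by default, so a bad cell in one column does not affect the totals of the others.

```python
import pandas as pd

# Illustrative data; "n/a" stands in for any non-numeric entry
df = pd.DataFrame({
    "SMSs Sent": [1, 2, 3],
    "SMSs Cost (INR)": ["10", "n/a", "30"],
    "Total Cost (INR)": [160, 170, 180],
})

totals = {}
for column in df.columns:
    numeric = pd.to_numeric(df[column], errors="coerce")
    # Values that were present but could not be parsed
    bad = int((numeric.isna() & df[column].notna()).sum())
    if bad:
        print(f"Warning: {bad} non-numeric value(s) in '{column}' ignored.")
    # Series.sum skips NaN by default, so the remaining rows still count
    totals[column] = numeric.sum()

totals = pd.Series(totals, name="total")
print(totals)
```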

#### Data Persistence:

Store the following processed DataFrames and aggregated results into a SQLite database named "Total.db":

   > The 'Unicon UUID' of rows skipped due to unequal pipe counts for each call type.

   > The DataFrames containing rows with pipe-separated values (before splitting) for each call type.

   > The final split DataFrames for Agent, Recording, and Call details.

   > The filtered DataFrames (duration > 0 & pulses = 0, and duration = 0 & pulses > 0) for each call type.

   > A combined DataFrame containing the total duration, pulses, and cost for each call type, along with the aggregated totals from the original DataFrame. The aggregated totals should be transposed and labeled appropriately.
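
As a rough sketch of the persistence step, the example below writes a small totals table to SQLite and reads it back. It uses an in-memory database so that it leaves no file behind, whereas the task calls for "Total.db"; the table name `combined_totals` and the numbers are made up for illustration.

```python
import sqlite3

import pandas as pd

# Illustrative totals, indexed by label as in a transposed totals table
combined = pd.DataFrame(
    {"total": [90.0, 45.0, 9.0]},
    index=["Call Duration", "Call Pulses", "Call Cost (INR)"],
)

# ":memory:" keeps the example self-contained; a real run would pass "Total.db"
conn = sqlite3.connect(":memory:")
try:
    # Store the index as a column named "field" and replace any existing table
    combined.to_sql("combined_totals", conn, if_exists="replace",
                    index=True, index_label="field")
    # Read the table back to confirm what was written
    stored = pd.read_sql_query("SELECT * FROM combined_totals", conn)
finally:
    # Close the connection even if the write or read fails
    conn.close()

print(stored)
```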




In summary, the script needs to efficiently read and process telephony data from CSV files, handle complex string splitting and reshaping for specific columns, perform filtering and aggregation, and finally persist the processed information into a structured SQLite database for further analysis. The script should also include appropriate error handling and informative warnings throughout the process.

In [84]:
import os
import pandas as pd
import sqlite3

def process_csv_folder(folder_path):
    """
    Reads and processes CSV files from a specified folder.

    Args:
        folder_path (str): The path to the folder containing CSV files.

    Returns:
        pandas.DataFrame or None: A DataFrame containing the combined data from the CSV files,
                                   or None if no valid CSV files are found.
    """
    # List all files in the folder that end with ".csv"
    csv_files = [
        filename for filename in os.listdir(folder_path) if filename.endswith(".csv")
    ]

    # Handle the case where no CSV files are found
    if not csv_files:
        print("Folder has no CSV files.")
        return None

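    # Handle the case where only one CSV file is present
    if len(csv_files) == 1:
        csv_file = os.path.join(folder_path, csv_files[0])
        try:
            return pd.read_csv(csv_file)
        except pd.errors.ParserError:
            # Apply the same parse-error handling as in the multi-file case
            print(f"Warning: Skipping '{csv_files[0]}' due to parsing error.")
            return None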

    # Initialize an empty list to store DataFrames read from each CSV file
    dfs = []
    for filename in csv_files:
        file_path = os.path.join(folder_path, filename)
        try:
            # Attempt to read the CSV file into a DataFrame
            df = pd.read_csv(file_path)
            dfs.append(df)
        except pd.errors.ParserError:
            # Catch and warn about parsing errors, then skip the problematic file
            print(f"Warning: Skipping DataFrame from '{filename}' due to parsing error.")
            continue

    # If we have successfully read at least one CSV file
    if dfs:
        # Check if all DataFrames have the same columns
        if all(df.columns.equals(dfs[0].columns) for df in dfs[1:]):
            # Concatenate all DataFrames into a single DataFrame and reset the index
            return pd.concat(dfs, ignore_index=True)
        else:
            # Warn if the CSV files have different columns and skip concatenation
            print("Warning: CSV files with different columns. Skipping concatenation.")
            return None
    else:
        # If no valid CSV files were found
        print("No valid CSV files found.")
        return None

def extract_columns_and_convert_to_string(df, columns):
    """
    Extracts specified columns from a DataFrame and converts their values to strings.

    Args:
        df (pandas.DataFrame): The input DataFrame.
        columns (list): A list of column names to extract.

    Returns:
        pandas.DataFrame: A new DataFrame containing the specified columns with string values.
    """
    return df[columns].astype(str)

def identify_rows_with_unequal_pipes(df, columns):
    """
    Identifies and removes rows where the count of '|' in specified columns is not equal.

    Args:
        df (pandas.DataFrame): The input DataFrame.
        columns (list): A list of four column names to check for equal '|' counts.
                        It assumes the first column is a unique identifier.

    Returns:
        tuple: A tuple containing the modified DataFrame (rows with unequal pipes removed)
               and a list of unique identifiers of the dropped rows.
    """
    rows_to_skip = []
    # Iterate through each row of the DataFrame
    for index, row in df.iterrows():
        # Extract the values from the specified columns
        uid, val1, val2, val3 = row[columns[0]], row[columns[1]], row[columns[2]], row[columns[3]]
        # Check if the count of '|' is the same in all three value columns
        if len({val1.count("|"), val2.count("|"), val3.count("|")}) != 1:
            # If not equal, add the unique identifier to the list of rows to skip
            rows_to_skip.append(uid)

    # If there are rows to skip, filter the DataFrame
    if rows_to_skip:
        # Keep only the rows where the unique identifier is NOT in the list of rows to skip
        df = df[~df[columns[0]].isin(rows_to_skip)]
        print(f"Dropped rows with Unequal number of '|' for IDs: {rows_to_skip}")

    return df, rows_to_skip

def split_values_by_pipe(df, columns):
    """
    Filters a DataFrame to include only rows where at least one of the specified columns contains a '|'.

    Args:
        df (pandas.DataFrame): The input DataFrame.
        columns (list): A list of four column names to check for the presence of '|'.
                        It assumes the first column is a unique identifier.

    Returns:
        pandas.DataFrame: A new DataFrame containing only the rows with '|' in the specified columns.
    """
    # Filter the DataFrame based on whether '|' is present in any of the specified value columns
    df_with_pipe_values = df[df.apply(lambda row: '|' in row[columns[1]] or '|' in row[columns[2]] or '|' in row[columns[3]], axis=1)]
    print(f"{len(df_with_pipe_values)} has at least one '|'")
    return df_with_pipe_values

def split_df(df, columns):
    """
    Splits rows of a DataFrame based on '|' delimiter in specified columns into multiple rows.

    Args:
        df (pandas.DataFrame): The input DataFrame.
        columns (list): A list of four column names. The first is assumed to be 'Unicon UUID',
                        and the subsequent three are the columns to split.

    Returns:
        pandas.DataFrame: A new DataFrame with rows split based on '|'.
    """
    # Check if the 'Unicon UUID' column exists
    if 'Unicon UUID' not in df.columns:
        print("Error: 'Unicon UUID' column not found.")
        return pd.DataFrame()

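    # Return an empty frame with the expected columns when there is nothing to split
    # (df.apply on an empty DataFrame returns a DataFrame, which has no .tolist())
    if df.empty:
        return pd.DataFrame(columns=['Unicon UUID', columns[1], columns[2], columns[3]])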

    def split_and_append(row):
        """
        Helper function to split values in a row and create a temporary DataFrame.

        Args:
            row (pandas.Series): A row of the input DataFrame.

        Returns:
            pandas.DataFrame: A temporary DataFrame with split values.
        """
        unicon_uuid = row['Unicon UUID']
        # Split the string values by '|' and convert them to numeric, coercing errors to 0
        durations = pd.to_numeric([float(x) if x.replace('.', '', 1).isdigit() else 0 for x in row[columns[1]].split('|')], errors='coerce')
        pulses = pd.to_numeric([float(x) if x.replace('.', '', 1).isdigit() else 0 for x in row[columns[2]].split('|')], errors='coerce')
        costs = pd.to_numeric([float(x) if x.replace('.', '', 1).isdigit() else 0 for x in row[columns[3]].split('|')], errors='coerce')

        # Determine the minimum number of values across the three split lists
        num_values = min(len(durations), len(pulses), len(costs))

        # Create a temporary DataFrame with the split values
        df_temp = pd.DataFrame({
            'Unicon UUID': [unicon_uuid] * num_values,
            columns[1]: durations[:num_values],
            columns[2]: pulses[:num_values],
            columns[3]: costs[:num_values]
        })
        return df_temp

    # Apply the split_and_append function to each row and concatenate the resulting DataFrames
    result_df = pd.concat(df.apply(split_and_append, axis=1).tolist(), ignore_index=True)
    return result_df

def filter_duration_greater_than_zero(df, columns):
    """
    Filters a DataFrame to include rows where the duration is greater than 0 and pulses are 0.

    Args:
        df (pandas.DataFrame): The input DataFrame.
        columns (list): A list of four column names. The second is assumed to be duration,
                        and the third is assumed to be pulses.

    Returns:
        pandas.DataFrame: A new DataFrame with the filtered rows.
    """
    filtered_df = df[(df[columns[1]] > 0) & (df[columns[2]] == 0)]
    return filtered_df

def filter_pulse_greater_than_zero(df, columns):
    """
    Filters a DataFrame to include rows where the duration is 0 and pulses are greater than 0.

    Args:
        df (pandas.DataFrame): The input DataFrame.
        columns (list): A list of four column names. The second is assumed to be duration,
                        and the third is assumed to be pulses.

    Returns:
        pandas.DataFrame: A new DataFrame with the filtered rows.
    """
    filtered_df = df[(df[columns[1]] == 0) & (df[columns[2]] > 0)]
    return filtered_df

def calculate_total_and_convert_to_float(df0):
    """
    Calculates the sum of specified numerical columns in a DataFrame.

    Args:
        df0 (pandas.DataFrame): The input DataFrame.

    Returns:
        pandas.Series: A Series containing the sum of each specified column.
                       Returns an empty Series if required columns are missing or cannot be converted.
    """
    # List of columns to calculate the total for
    selected_columns = [
        "Unicon UUID", "SMSs Sent", "SMSs Unit", "SMSs Cost (INR)", "Emails Unit",
        "Emails Cost (INR)", "STT Call Pulses", "STT Cost (INR)", "TTS Call Pulses",
        "TTS Cost (INR)", "Total Additional Services Cost (INR)", "Total Cost (INR)"
    ]

    # Check if all required columns exist in the DataFrame
    if not all(col in df0.columns for col in selected_columns):
        print("Warning: One or more required columns not found for total calculation.")
        return pd.Series(dtype=float)

    # Create a copy of the DataFrame with only the selected columns
    selected_df = df0[selected_columns].copy()

    # Iterate through the columns (excluding the 'Unicon UUID') to convert to numeric
    for column in selected_df.columns[1:]:
        # Convert column values to numeric; values that cannot be parsed become NaN
        converted = pd.to_numeric(selected_df[column], errors='coerce')
        # Count values that were present but could not be parsed
        n_bad = int((converted.isna() & selected_df[column].notna()).sum())
        if n_bad:
            # Warn but keep the rows: sum() skips NaN, so a bad cell in one column
            # must not remove the row from the totals of the other columns
            print(f"Warning: {n_bad} non-numeric value(s) in column '{column}' treated as NaN.")
        selected_df[column] = converted

    # Calculate the sum of the numeric columns (excluding 'Unicon UUID')
    if selected_df.shape[1] > 1:
        totals = selected_df.iloc[:, 1:].sum()
    else:
        totals = pd.Series(dtype=float)

    return totals

def calculate_total(df, columns):
    """
    Calculates the total duration, pulses, and cost from a DataFrame.

    Args:
        df (pandas.DataFrame): The input DataFrame.
        columns (list): A list of four column names. The second is duration,
                        the third is pulses, and the fourth is cost.

    Returns:
        tuple: A tuple containing the total duration, total pulses, and total cost.
    """
    total_duration = df[columns[1]].sum()
    total_pulses = df[columns[2]].sum()
    total_cost = df[columns[3]].sum()
    return total_duration, total_pulses, total_cost

def store_to_sqlite(data, table_name):
    """
    Stores a Pandas DataFrame (or data that can be converted to one) into a SQLite database.

    Args:
        data (pandas.DataFrame or other): The data to store.
        table_name (str): The name of the table in the SQLite database.
    """
    # Ensure the data is a Pandas DataFrame before opening a connection
    if not isinstance(data, pd.DataFrame):
        try:
            data = pd.DataFrame(data)
        except Exception as e:
            print(f"Error: Unable to convert data to DataFrame - {e}")
            return

    # Connect to the SQLite database (creates the file if it doesn't exist)
    conn = sqlite3.connect('Total.db')
    try:
        # Write the DataFrame to the SQLite table, replacing the table if it already exists
        # 'index=True' adds the DataFrame index as a column named 'field'
        data.to_sql(table_name, conn, if_exists='replace', index=True, index_label='field')
        # Commit the changes to the database
        conn.commit()
    finally:
        # Close the database connection even if the write fails
        conn.close()

if __name__ == "__main__":
    
    # Example of how to process CSV files from a folder
    folder_path = r"/Users/black_pearl/Desktop/Internship/Python Phonon work/"
    df0 = process_csv_folder(folder_path)
    '''
    # Sample data as a dictionary to simulate reading from a CSV
    data = {
        "Unicon UUID": ["uuid1", "uuid2", "uuid3"],
        "Agent Call Duration": ["1", "40|50", "60|70"],
        "Agent Call Pulses": ["5", "8|9", "10|11"],
        "Agent Call Cost (INR)": ["100|200|300", "400|500", "600|700"],
        "Call Recording Duration": ["15|25|35|-|45", "45|55|-", "65|75|85|95"],
        "Call Recording Pulses": ["7|8|9|-|10", "10|11|-", "12|13|14|15"],
        "Call Recording Cost (INR)": ["150|250|350|-|450", "450|550|-", "650|750|850|950"],
        "Call Duration": ["20|30|40|-|50", "50|60|-", "70|80|90|100"],
        "Call Pulses": ["9|10|11|-|12", "12|13|-", "14|15|16|17"],
        "Call Cost (INR)": ["200|300|400|-|500", "500|600|-", "700|800|900|1000"],
        "SMSs Sent": [1, 2, 3],
        "SMSs Unit": ["unit1", "unit2", "unit3"],
        "SMSs Cost (INR)": [10, 20, 30],
        "Emails Unit": ["unit4", "unit5", "unit6"],
        "Emails Cost (INR)": [40, 50, 60],
        "STT Call Pulses": [7, 8, 9],
        "STT Cost (INR)": [70, 80, 90],
        "TTS Call Pulses": [4, 5, 6],
        "TTS Cost (INR)": [100, 110, 120],
        "Total Additional Services Cost (INR)": [130, 140, 150],
        "Total Cost (INR)": [160, 170, 180]
    }
    df0 = pd.DataFrame(data)
    '''

    # Proceed only if a DataFrame was successfully loaded
    if df0 is not None:
        columns_agent = ["Unicon UUID", "Agent Call Duration", "Agent Call Pulses", "Agent Call Cost (INR)"]
        columns_recording = ["Unicon UUID", "Call Recording Duration", "Call Recording Pulses", "Call Recording Cost (INR)"]
        columns_call = ["Unicon UUID", "Call Duration", "Call Pulses", "Call Cost (INR)"]

        # Process Agent call data
        Agent = extract_columns_and_convert_to_string(df0.copy(), columns_agent)
        len_Agent = len(Agent)
        print(f"Initial length of Agent DataFrame: {len_Agent}")
        Agent, rows_to_skipA = identify_rows_with_unequal_pipes(Agent.copy(), columns_agent)
        df_with_pipe_valuesA = split_values_by_pipe(Agent.copy(), columns_agent)
        Agent = split_df(Agent.copy(), columns_agent)
        print(f"Length of Agent DataFrame after splitting: {len(Agent)}")
        filter_duration_greater_than_zeroA = filter_duration_greater_than_zero(Agent.copy(), columns_agent)
        filter_pulse_greater_than_zeroA = filter_pulse_greater_than_zero(Agent.copy(), columns_agent)

        # Process Recording call data
        Recording = extract_columns_and_convert_to_string(df0.copy(), columns_recording)
        len_Recording = len(Recording)
        print(f"Initial length of Recording DataFrame: {len_Recording}")
        Recording, rows_to_skipR = identify_rows_with_unequal_pipes(Recording.copy(), columns_recording)
        df_with_pipe_valuesR = split_values_by_pipe(Recording.copy(), columns_recording)
        Recording = split_df(Recording.copy(), columns_recording)
        filter_duration_greater_than_zeroR = filter_duration_greater_than_zero(Recording.copy(), columns_recording)
        filter_pulse_greater_than_zeroR = filter_pulse_greater_than_zero(Recording.copy(), columns_recording)

        # Process Call data
        Call = extract_columns_and_convert_to_string(df0.copy(), columns_call)
        len_Call = len(Call)
        print(f"Initial length of Call DataFrame: {len_Call}")
        Call, rows_to_skipC = identify_rows_with_unequal_pipes(Call.copy(), columns_call)
        df_with_pipe_valuesC = split_values_by_pipe(Call.copy(), columns_call)
        Call = split_df(Call.copy(), columns_call)
        filter_duration_greater_than_zeroC = filter_duration_greater_than_zero(Call.copy(), columns_call)
        filter_pulse_greater_than_zeroC = filter_pulse_greater_than_zero(Call.copy(), columns_call)

        # Calculate totals for each call type
        total_agent_duration, total_agent_pulses, total_agent_cost = calculate_total(Agent.copy(), columns_agent)
        total_recording_duration, total_recording_pulses, total_recording_cost = calculate_total(Recording.copy(), columns_recording)
        total_call_duration, total_call_pulses, total_call_cost = calculate_total(Call.copy(), columns_call)

        # Create DataFrames for the totals of each call type
        totals_agents = pd.DataFrame({
            "Agent Call Duration": [total_agent_duration],
            "Agent Call Pulses": [total_agent_pulses],
            "Agent Call Cost (INR)": [total_agent_cost]
        }).T

        totals_recording = pd.DataFrame({
            "Call Recording Duration": [total_recording_duration],
            "Call Recording Pulses": [total_recording_pulses],
            "Call Recording Cost (INR)": [total_recording_cost]
        }).T

        totals_call = pd.DataFrame({
            "Call Duration": [total_call_duration],
            "Call Pulses": [total_call_pulses],
            "Call Cost (INR)": [total_call_cost]
        }).T

        # Calculate the sum of specific columns from the original DataFrame
        totals = pd.DataFrame(calculate_total_and_convert_to_float(df0.copy()))
        combined_df = pd.concat([totals_agents, totals_recording, totals_call, totals], axis=0, ignore_index=False)
        combined_df.columns = ['total']

        """
        # Store processed DataFrames and skipped row IDs to SQLite
        store_to_sqlite(pd.DataFrame({'UID': rows_to_skipA}), "rows_to_skipA")
        store_to_sqlite(pd.DataFrame({'UID': rows_to_skipR}), "rows_to_skipR")
        store_to_sqlite(pd.DataFrame({'UID': rows_to_skipC}), "rows_to_skipC")

        store_to_sqlite(df_with_pipe_valuesA, "df_with_pipe_valuesA")
        store_to_sqlite(df_with_pipe_valuesR, "df_with_pipe_valuesR")
        store_to_sqlite(df_with_pipe_valuesC, "df_with_pipe_valuesC")

        store_to_sqlite(Agent, "Agent")
        store_to_sqlite(Recording, "Recording")
        store_to_sqlite(Call, "Call")

        store_to_sqlite(filter_duration_greater_than_zeroA, "filter_duration_greater_than_zeroA")
        store_to_sqlite(filter_pulse_greater_than_zeroA, "filter_pulse_greater_than_zeroA")

        store_to_sqlite(filter_duration_greater_than_zeroR, "filter_duration_greater_than_zeroR")
        store_to_sqlite(filter_pulse_greater_than_zeroR, "filter_pulse_greater_than_zeroR")

        store_to_sqlite(filter_duration_greater_than_zeroC, "filter_duration_greater_than_zeroC")
        store_to_sqlite(filter_pulse_greater_than_zeroC, "filter_pulse_greater_than_zeroC")

        store_to_sqlite(combined_df, "total")
        """
    else:
        print("No CSV data processed.")

Initial length of Agent DataFrame: 20
8 has at least one '|'
Length of Agent DataFrame after splitting: 30
Initial length of Recording DataFrame: 20
Dropped rows with Unequal number of '|' for IDs: ['UUID_D1']
7 has at least one '|'
Initial length of Call DataFrame: 20
Dropped rows with Unequal number of '|' for IDs: ['UUID_D1']
8 has at least one '|'


In [85]:
rows_to_skipA

[]

In [86]:
df_with_pipe_valuesA

Unnamed: 0,Unicon UUID,Agent Call Duration,Agent Call Pulses,Agent Call Cost (INR)
0,UUID_A1,10|20,5|10,1|2
3,UUID_B2,40|50|60,20|25|30,2|2.5|3
5,UUID_C2,15|18,8|9,1.5|1.8
8,UUID_E1,45|55,22|27,2.2|2.7
12,UUID_G1,70|80,35|40,7|8
14,UUID_H1,200|220|240,100|110|120,20|22|24
17,UUID_I2,120|130,60|65,12|13
19,UUID_J2,150|160,75|80,15|16


In [87]:
Agent

Unnamed: 0,Unicon UUID,Agent Call Duration,Agent Call Pulses,Agent Call Cost (INR)
0,UUID_A1,10.0,5.0,1.0
1,UUID_A1,20.0,10.0,2.0
2,UUID_A2,0.0,0.0,0.0
3,UUID_B1,30.0,15.0,3.0
4,UUID_B2,40.0,20.0,2.0
5,UUID_B2,50.0,25.0,2.5
6,UUID_B2,60.0,30.0,3.0
7,UUID_C1,25.0,12.0,2.5
8,UUID_C2,15.0,8.0,1.5
9,UUID_C2,18.0,9.0,1.8


In [88]:
filter_duration_greater_than_zeroA

Unnamed: 0,Unicon UUID,Agent Call Duration,Agent Call Pulses,Agent Call Cost (INR)


In [89]:
filter_pulse_greater_than_zeroA

Unnamed: 0,Unicon UUID,Agent Call Duration,Agent Call Pulses,Agent Call Cost (INR)


In [90]:
rows_to_skipR

['UUID_D1']

In [91]:
df_with_pipe_valuesR

Unnamed: 0,Unicon UUID,Call Recording Duration,Call Recording Pulses,Call Recording Cost (INR)
2,UUID_B1,60|70,30|35,6|7
4,UUID_C1,120|130,60|65,12|13
8,UUID_E1,240|250,120|125,24|25
12,UUID_G1,360|370,180|185,36|37
15,UUID_H2,450|460,225|230,45|46
17,UUID_I2,510|520,255|260,51|52
19,UUID_J2,570|580,285|290,57|58


In [92]:
Recording

Unnamed: 0,Unicon UUID,Call Recording Duration,Call Recording Pulses,Call Recording Cost (INR)
0,UUID_A1,30.0,15.0,3.0
1,UUID_A2,0.0,0.0,0.0
2,UUID_B1,60.0,30.0,6.0
3,UUID_B1,70.0,35.0,7.0
4,UUID_B2,90.0,45.0,9.0
5,UUID_C1,120.0,60.0,12.0
6,UUID_C1,130.0,65.0,13.0
7,UUID_C2,150.0,75.0,15.0
8,UUID_D2,210.0,105.0,21.0
9,UUID_E1,240.0,120.0,24.0


In [93]:
filter_duration_greater_than_zeroR

Unnamed: 0,Unicon UUID,Call Recording Duration,Call Recording Pulses,Call Recording Cost (INR)


In [94]:
filter_pulse_greater_than_zeroR

Unnamed: 0,Unicon UUID,Call Recording Duration,Call Recording Pulses,Call Recording Cost (INR)


In [95]:
rows_to_skipC

['UUID_D1']

In [96]:
df_with_pipe_valuesC

Unnamed: 0,Unicon UUID,Call Duration,Call Pulses,Call Cost (INR)
0,UUID_A1,15|25,7|12,1.5|2.5
3,UUID_B2,5|10,2|5,0.5|1
7,UUID_D2,20|22,10|11,2|2.2
9,UUID_E2,30|33,15|16,3|3.3
10,UUID_F1,60|70|80,30|35|40,6|7|8
13,UUID_G2,50|55,25|27,5|5.5
15,UUID_H2,80|85,40|42,8|8.5
18,UUID_J1,100|110,50|55,10|11


In [97]:
Call

Unnamed: 0,Unicon UUID,Call Duration,Call Pulses,Call Cost (INR)
0,UUID_A1,15.0,7.0,1.5
1,UUID_A1,25.0,12.0,2.5
2,UUID_A2,40.0,20.0,4.0
3,UUID_B1,0.0,0.0,0.0
4,UUID_B2,5.0,2.0,0.5
5,UUID_B2,10.0,5.0,1.0
6,UUID_C1,35.0,18.0,3.5
7,UUID_C2,10.0,5.0,1.0
8,UUID_D2,20.0,10.0,2.0
9,UUID_D2,22.0,11.0,2.2


In [98]:
filter_duration_greater_than_zeroC

Unnamed: 0,Unicon UUID,Call Duration,Call Pulses,Call Cost (INR)


In [99]:
filter_pulse_greater_than_zeroC

Unnamed: 0,Unicon UUID,Call Duration,Call Pulses,Call Cost (INR)
27,UUID_J2,0.0,1.0,0.0
