# 📊 Completeness Check Notebook

This notebook checks every cell of a CSV file for completeness, that is, whether the cell holds a value or is missing. Each cell is identified by its column name and its **Excel-style row number**: with the header in row 1, the first data row is row 2.

In [1]:

# Step 1: Import Libraries
import pandas as pd


In [2]:

# Step 2: Load the CSV file (same folder as this notebook)
file_path = 'Planview Data.csv'
df = pd.read_csv(file_path)




In [3]:

# Step 3: Create Excel-style row index (starts at 2 because row 1 is header)
df.index = df.index + 2  # Excel row numbers start at 2
df.index.name = 'Index (Row number you created earlier)'


In [4]:

# Step 4: Melt DataFrame to long format
melted_df = df.reset_index().melt(id_vars='Index (Row number you created earlier)', 
                                  var_name='Column Name', 
                                  value_name='Value')
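
To make the long format concrete, here is a minimal sketch of the same melt on a toy table. The DataFrame `toy`, its columns `Name` and `Cost`, and the shortened index label `Row` are hypothetical stand-ins, not part of the real data.

```python
import pandas as pd

# Hypothetical three-row table standing in for the real CSV.
toy = pd.DataFrame({"Name": ["A", None, "C"], "Cost": [10, 20, None]})
toy.index = toy.index + 2   # Excel-style row numbers: 2, 3, 4
toy.index.name = "Row"      # shortened label, used only in this sketch

# One output row per (row number, column) pair.
toy_long = toy.reset_index().melt(
    id_vars="Row", var_name="Column Name", value_name="Value"
)
print(toy_long)
```

The 3 x 2 table becomes 6 rows: all `Name` cells first, then all `Cost` cells, each carrying its Excel row number.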


In [5]:

# Step 5: Apply completeness check: 1 = complete, 0 = missing (NaN, empty, or whitespace-only)
melted_df['Complete (1 or 0)'] = melted_df['Value'].apply(
    lambda x: 1 if pd.notnull(x) and str(x).strip() != '' else 0
)
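
On large files the row-by-row `apply` can be slow. The sketch below shows one possible vectorized equivalent and compares it with the lambda from Step 5. The sample Series `values` is made up for illustration.

```python
import pandas as pd

# Hypothetical sample covering the cases the check distinguishes.
values = pd.Series(["ok", "", "   ", None, 0, 3.5, float("nan")], dtype="object")

# Row-wise version, as in Step 5.
rowwise = values.apply(
    lambda x: 1 if pd.notnull(x) and str(x).strip() != "" else 0
)

# Vectorized version: non-null AND not blank after stripping whitespace.
vectorized = (
    values.notna() & values.astype(str).str.strip().ne("")
).astype(int)

print(vectorized.tolist())
```

Both versions treat `0` as complete, since it is a real value, and treat empty or whitespace-only strings as missing.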


In [6]:

# Step 6: Add sequential Sr# column
melted_df.insert(0, 'Sr#', range(1, len(melted_df) + 1))


In [7]:

# Step 7: Select required columns
final_df = melted_df[['Sr#', 'Column Name', 'Index (Row number you created earlier)', 'Complete (1 or 0)']]


In [8]:

# Step 8: Split into 4 roughly equal parts (the last part absorbs any remainder) and export
num_parts = 4
chunk_size = len(final_df) // num_parts

for i in range(num_parts):
    start = i * chunk_size
    end = (i + 1) * chunk_size if i < num_parts - 1 else len(final_df)
    part_df = final_df.iloc[start:end]
    part_df.to_csv(f'completeness_part_{i+1}.csv', index=False)

print(f"✅ Completed! Exported {num_parts} CSV files for completeness check.")


✅ Completed! Exported 4 CSV files for completeness check.
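
In Step 8 the last file receives every leftover row when the row count is not divisible by the number of parts. If more even files are preferred, one option is to spread the remainder across the first few parts. The helper `split_bounds` below is a hypothetical sketch, not part of the notebook.

```python
def split_bounds(n_rows, num_parts):
    """Return (start, end) pairs covering range(n_rows).

    Part sizes differ by at most one row.
    """
    base, extra = divmod(n_rows, num_parts)
    bounds = []
    start = 0
    for i in range(num_parts):
        # The first `extra` parts each take one additional row.
        end = start + base + (1 if i < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds

print(split_bounds(10, 4))  # [(0, 3), (3, 6), (6, 8), (8, 10)]
```

Each pair can be passed to `final_df.iloc[start:end]` in place of the `start` and `end` values computed in the loop.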
