# Introducing Aetus
In the real world, many AI models already serve consumers and industries. Some examples:
*   **Recommendation Systems:** Netflix and Amazon use these to suggest movies or products you might like based on your viewing/purchase history.
*   **Spam Filters:** Email providers use machine learning to classify emails as spam or not.
*   **Fraud Detection:** Banks employ ML to identify potentially fraudulent transactions.
*   **Medical Diagnosis:** AI models assist doctors in analyzing medical images (X-rays, MRIs) for diseases.
*   **Natural Language Processing:** Google Translate and chatbots use ML to understand and generate human language.

Aetus is a governance module that helps keep these models safe and secure. It is designed to guard against threats such as the following:

*   **Data Poisoning:** Attackers can manipulate training data to introduce biases or backdoors.
*   **Model Theft:** Competitors or malicious actors can steal a model's architecture and parameters, replicating its functionality without investing in development.
*   **Unauthorized Access:** Without proper authorization, individuals or systems may gain access to sensitive models, leading to misuse, data breaches, or intellectual property theft.
*   **Adversarial Attacks:** Attackers can craft subtle input perturbations that cause the model to make incorrect predictions without being easily detectable.
*   **Membership Inference Attacks:** Attackers can infer whether a specific data point was part of the training data, potentially revealing sensitive information.
*   **Model Extraction Attacks:** Attackers can train a substitute model by querying the target model, gradually replicating its functionality.
*   **Backdoor Attacks:** Attackers can insert hidden triggers into the model that cause it to behave maliciously under specific conditions.
*   **Data Leakage:** Models can unintentionally leak sensitive information about the training data through their predictions or internal representations.
*   **Denial-of-Service Attacks:** Attackers can flood the model with requests, making it unavailable to legitimate users.


### Installing required libraries

In [None]:
!pip install plyer
!pip install cryptography




### Importing required libraries

In [1]:
import os
import pandas as pd
import numpy as np
import hashlib
import json
import base64
from plyer import notification
from cryptography.fernet import Fernet
import pickle
from pprint import pprint
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa
from cryptography.hazmat.primitives import serialization
from sklearn.ensemble import IsolationForest
import smtplib
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report


### Aetus has a powerful UI that lets an admin configure all the settings required to safeguard the models
These are the configurations supported by Aetus.

The carousel below shows all the model configuration screens.

In [None]:
import numpy as np
import plotly.express as px
from imageio import imread
import imageio.v2 as imageio
from PIL import Image
import xarray as xr

# Function to resize images to the same shape (height, width)
def resize_images(images, target_size=(850, 850)):
    resized_images = []
    for img in images:
        pil_img = Image.fromarray(img)  # Convert numpy array to PIL Image
        resized_img = pil_img.resize(target_size)  # Resize image
        resized_images.append(np.array(resized_img))  # Convert back to numpy array
    return resized_images

# Show a list of numpy images as a carousel
# key is the label of the slider
def show_images_carousel(images, labels, key: str, title: str, height:int, width:int):        
    stacked = np.stack(images, axis=0)
    xrData = xr.DataArray(
        data   = stacked,
        dims   = [key, 'row', 'col', 'rgb'],
        coords = {key: labels}
    )
    # Hide the axes and set the layout size
    layout_dict = dict(
        yaxis_visible=False, 
        yaxis_showticklabels=False, 
        xaxis_visible=False, 
        xaxis_showticklabels=False,
        height=height,
        width=width
    )
    return px.imshow(xrData, title=title, animation_frame=key).update_layout(layout_dict)

# Show a list of URLs as a carousel, loading them as numpy images first
def show_images_carousel_from_urls(image_urls, labels, key: str, title:str, height:int, width:int):
    images = [imread(url, pilmode='RGB') for url in image_urls]
    images = resize_images(images)  # Resize images to the same shape
    return show_images_carousel(images, labels, key, title, height, width)

# Map each screenshot path to the label shown on the carousel slider
images = {
    'D:/Aetus/Aetus/screenshots/DataSecurity.png': 'Data Security',
    'D:/Aetus/Aetus/screenshots/ModelDevelopment.png': "Model Development",
    'D:/Aetus/Aetus/screenshots/ModelInfrastructure.png': "Model Infrastructure",
    'D:/Aetus/Aetus/screenshots/ModelSecurity.png': "Model Security",
    'D:/Aetus/Aetus/screenshots/ModelUsage.png': 'Model Usage',
    'D:/Aetus/Aetus/screenshots/ModelMonitoring.png': 'Model Monitoring',   
}

# Build the carousel with no title; the figure is 800 px high and 700 px wide
fig = show_images_carousel_from_urls(list(images.keys()), list(images.values()), 'Method', None, 800, 700)
fig.show()






In [None]:
config_path = r"HeartAttackDetection.csv-Classification-Aug22-23-16-29-54.pkl.json"

# Load the governance configuration; the context manager closes the file afterwards
with open(config_path) as config_file:
    contents = json.load(config_file)
pprint(contents)

{'dataSecurity': {'detectDataPoisoning': True,
                  'enableDataCataloguePolicy': False,
                  'enableDataMinimisation': False,
                  'enableModelAccessControl': True,
                  'enableModelBackup': True,
                  'preventDataInjections': False,
                  'preventDataInputManipulation': True,
                  'preventDataLabelManipulation': True,
                  'preventLogicCorruption': False},
 'details': {'author': 'ravivarma',
             'category': 'HeartAttackDetection',
             'date': '2024-06-05 00:00:00',
             'name': 'HeartAttackDetection.csv-Classification-Aug22-23-16-29-54.pkl'},
 'infraSturcture': {'certificateSpoofing': False,
                    'deepContractiveNetwork': True,
                    'defendModelDeployment': True,
                    'envCompliance': False,
                    'modelTheft': False,
                    'nodeDeactivation': True,
                    'secureMultiparty
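The cells that follow index `contents` directly, for example `contents['dataSecurity']['detectDataPoisoning']`, which raises `KeyError` if a section or flag is missing from the JSON. A minimal sketch of a more defensive lookup, assuming a missing flag should be treated as disabled. `is_enabled` and `sample_config` are hypothetical names used only for illustration and are not part of Aetus:

```python
def is_enabled(config: dict, section: str, flag: str) -> bool:
    """Return True only if config[section][flag] exists and is exactly True."""
    section_values = config.get(section, {})
    # Guard against a section that is present but is not a dictionary
    if not isinstance(section_values, dict):
        return False
    return section_values.get(flag) is True


# Hypothetical configuration fragment mirroring the structure printed above
sample_config = {
    "dataSecurity": {"detectDataPoisoning": True, "preventDataInjections": False},
}

print(is_enabled(sample_config, "dataSecurity", "detectDataPoisoning"))
print(is_enabled(sample_config, "dataSecurity", "preventDataInjections"))
print(is_enabled(sample_config, "modelUsage", "detectAdversarialInput"))
```

With a helper like this, a guard could be written as `if is_enabled(contents, 'dataSecurity', 'detectDataPoisoning'):` without failing on partial configurations.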

In [None]:
def send_desktop_alert(title, message):
    # Show a desktop notification via plyer; it is dismissed after 10 seconds
    notification.notify(
        title=title,
        message=message,
        timeout=10
    )

# Data Security - Detect Data Poisoning

In [None]:
if contents['dataSecurity']['detectDataPoisoning']==True:
    def validate_data(data):
        print('validate_data')
        # Check for anomalies or inconsistencies in the data
        if data.isnull().sum().sum() > 0:
            raise ValueError("Missing values detected in the dataset")
        # Add more validation checks as needed
        return True

    # `data` is assumed to be the DataFrame under test, loaded before this cell runs
    try:
        validate_data(data)
    except ValueError as e:
        print(e)
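The `validate_data` function above only checks for missing values. A few further checks that are sometimes used as cheap poisoning signals are sketched here, assuming a pandas DataFrame with a binary `Target` column. The function name `extra_validation_checks`, the thresholds, and the toy data are all illustrative assumptions rather than Aetus behaviour:

```python
import pandas as pd


def extra_validation_checks(data: pd.DataFrame, label_col: str = "Target",
                            max_duplicate_ratio: float = 0.05,
                            min_class_ratio: float = 0.10) -> list:
    """Return a list of human-readable warnings; an empty list means no issue found."""
    problems = []

    # Many exact duplicate rows can indicate injected records
    duplicate_ratio = data.duplicated().mean()
    if duplicate_ratio > max_duplicate_ratio:
        problems.append(f"duplicate rows: {duplicate_ratio:.0%}")

    # A heavily skewed label distribution can indicate label flipping
    class_ratios = data[label_col].value_counts(normalize=True)
    if class_ratios.min() < min_class_ratio:
        problems.append(f"rare class ratio: {class_ratios.min():.0%}")

    # Labels outside the expected binary set are always suspicious
    unexpected = set(data[label_col].unique().tolist()) - {0, 1}
    if unexpected:
        problems.append(f"unexpected labels: {sorted(unexpected)}")

    return problems


# Hypothetical toy data: one duplicated row and one invalid label
toy = pd.DataFrame({"Age": [63, 63, 41, 56, 57], "Target": [1, 1, 0, 0, 7]})
print(extra_validation_checks(toy))
```

Returning a list of findings instead of raising on the first one lets the caller decide whether to block training or only raise an alert.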

# Data Security - Prevent Data Label Manipulation

In [None]:
if contents['dataSecurity']['preventDataLabelManipulation']==True:
    def encrypt_file(file_path):
        print('encrypt_file')
        # Generate a fresh Fernet key and encrypt the file contents with it
        key = Fernet.generate_key()
        cipher_suite = Fernet(key)
        with open(file_path, 'rb') as file:
            file_data = file.read()
        encrypted_data = cipher_suite.encrypt(file_data)
        with open(file_path + '.enc', 'wb') as file:
            file.write(encrypted_data)
            print('file encrypted')
        # Persist the key so the file can be decrypted later
        with open('encryption_key.key', 'wb') as key_file:
            key_file.write(key)
            print(key)
        return key

    def decrypt_file(file_path, key):
        cipher_suite = Fernet(key)
        with open(file_path, 'rb') as file:
            encrypted_data = file.read()
        decrypted_data = cipher_suite.decrypt(encrypted_data)
        # The decrypted copy is written to the current working directory
        decrypted_path = 'decrypted_' + os.path.basename(file_path).replace('.enc', '')
        with open(decrypted_path, 'wb') as file:
            file.write(decrypted_data)
            print('file decrypted')
        return decrypted_path

    # Encrypt the dataset
    key = encrypt_file('D:/HeartAttackDetection.csv')

    # Decrypt the dataset for usage and load the decrypted copy
    decrypted_path = decrypt_file('D:/HeartAttackDetection.csv.enc', key)
    df = pd.read_csv(decrypted_path)
    print('file decrypted')
    print(df.columns)
else:
    df = pd.read_csv('D:/HeartAttackDetection.csv')
    print(df.columns)

encrypt_file
file encrypted
b'Ug6NUhdpy0cMwXaGay1XPfFn2UvA1Ci_iQsqhrodu4k='
file decrypted
file decrypted
Index(['Age', 'Sex', 'CP_Type', 'BloodPressure', 'Cholestrol', 'BloodSugar',
       'ECG', 'MaxHeartRate', 'ExerciseAngina', 'FamilyHistory', 'Target'],
      dtype='object')
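The later cells use `X_train`, `X_test`, `y_train`, and `y_test`, but no cell shown here defines them. A minimal sketch of how they could be produced, assuming `Target` is the label column and an 80/20 split. Both are assumptions, as are the `random_state` value and the toy rows standing in for the real CSV:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy stand-in for the real dataset; in the notebook this would be the loaded `df`
df_demo = pd.DataFrame({
    "Age": [63, 37, 41, 56, 57, 45, 52, 60, 39, 48],
    "MaxHeartRate": [150, 187, 172, 178, 163, 160, 155, 140, 180, 170],
    "Target": [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
})

# Separate the features from the label column
X = df_demo.drop(columns=["Target"])
y = df_demo["Target"]

# Hold out 20% of the rows for testing; stratify keeps the class balance in both parts
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)
```

Fixing `random_state` makes the split repeatable, which helps when comparing the differently hardened models trained in the following sections.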


# Model Usage - Detect Adversarial Input

In [None]:
# Detecting adversarial inputs
if contents['modelUsage']['detectAdversarialInput']==True :

    # Train an Isolation Forest for anomaly detection
    clf = IsolationForest(contamination=0.1)
    clf.fit(X_train)

    # Detect anomalies in the test data (predict returns -1 for anomalous rows)
    y_pred_anomalies = clf.predict(X_test)
    anomalies = X_test[y_pred_anomalies == -1]
    print(f"Detected anomalies: {len(anomalies)}")
    if len(anomalies)>4:
        send_desktop_alert("Potential data poisoning", f"The data has {len(anomalies)} anomalies")

# Model Development - Feature Squeezing

In [None]:
if contents['modelDevelopment']['featureSqueezing']==True :
    def feature_squeeze(X):
        # Reduce input precision to two decimals to limit room for small perturbations
        return np.round(X, decimals=2)

    X_train_squeezed = feature_squeeze(X_train)
    X_test_squeezed = feature_squeeze(X_test)
    Y_train_squeezed = feature_squeeze(y_train)
    Y_test_squeezed = feature_squeeze(y_test)
    print('features squeezed')

    # Initialize the Logistic Regression model
    model = LogisticRegression()

    # Train the model
    model.fit(X_train_squeezed, Y_train_squeezed)

    # Make predictions on the test set
    y_pred = model.predict(X_test_squeezed)

    # Evaluate the model against the same squeezed labels throughout
    accuracy = accuracy_score(Y_test_squeezed, y_pred)
    conf_matrix = confusion_matrix(Y_test_squeezed, y_pred)
    class_report = classification_report(Y_test_squeezed, y_pred)
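In the adversarial-robustness literature, feature squeezing is usually used as a detector: the model's prediction on the original input is compared with its prediction on the squeezed input, and a large gap flags the input as suspicious. A minimal sketch of that comparison, using a hand-written logistic scorer in place of a trained model. `toy_predict_proba`, its weights, the one-decimal squeeze, and the `threshold` value are illustrative assumptions:

```python
import numpy as np


def feature_squeeze(X, decimals=1):
    # Reduce precision so tiny perturbations are rounded away
    return np.round(X, decimals=decimals)


def toy_predict_proba(X):
    # Stand-in for a trained model: a fixed logistic scorer (hypothetical weights)
    weights = np.array([40.0, -40.0])
    return 1.0 / (1.0 + np.exp(-(X @ weights)))


def flag_suspicious(X, threshold=0.3):
    # Flag rows whose score moves by more than `threshold` after squeezing
    gap = np.abs(toy_predict_proba(X) - toy_predict_proba(feature_squeeze(X)))
    return gap > threshold


X_demo = np.array([
    [0.50, 0.50],   # unchanged by squeezing, so the gap is zero
    [0.54, 0.46],   # small offsets that squeezing rounds away
])
print(flag_suspicious(X_demo))
```

The second row is flagged because the small offsets swing the toy score strongly, while the squeezed version of the same row does not.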


# Model Development - Regularization

In [None]:
if contents['modelDevelopment']['regularization']==True :
    from keras.regularizers import l2
    from keras.models import Sequential
    from keras.layers import Dense
    from keras.optimizers import Adam

    # Build a small neural network with L2 regularization on the hidden layer's weights
    model = Sequential()
    model.add(Dense(10, input_shape=(X_train.shape[1],), activation='relu', kernel_regularizer=l2(0.01)))
    model.add(Dense(3, activation='softmax'))

    model.compile(optimizer=Adam(learning_rate=0.001), loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    model.fit(X_train, y_train, epochs=50, batch_size=32, validation_data=(X_test, y_test))
    print('regularized model')

Epoch 1/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 2s 57ms/step - accuracy: 0.5227 - loss: 130.9419 - val_accuracy: 0.5246 - val_loss: 123.9128
Epoch 2/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step - accuracy: 0.5241 - loss: 121.5583 - val_accuracy: 0.5246 - val_loss: 113.8647
Epoch 3/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - accuracy: 0.5112 - loss: 111.1686 - val_accuracy: 0.5246 - val_loss: 103.9305
Epoch 4/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step - accuracy: 0.5872 - loss: 87.4301 - val_accuracy: 0.5246 - val_loss: 94.3570
Epoch 5/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 17ms/step - accuracy: 0.5401 - loss: 89.8831 - val_accuracy: 0.5246 - val_loss: 84.6571
Epoch 6/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step - accuracy: 0.5104 - loss: 81.4207 - val_accuracy: 0.5246 - val_loss: 75.2928
Epoch 7/50
8/8

# Model Development - Defensive Distillation

In [None]:
if contents['modelDevelopment']['defensiveDistillation']==True :
    from keras.models import Sequential
    from keras.layers import Dense
    from keras.optimizers import Adam
    from keras.losses import KLDivergence
    
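    # Create and train the distilled (student) model.
    # Note: y_train holds hard labels here; defensive distillation normally trains the
    # student on a teacher model's temperature-softened class probabilities instead.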
    def train_defensive_distillation(X_train, y_train, X_test, y_test):
        distilled_model = Sequential()
        distilled_model.add(Dense(10, input_shape=(X_train.shape[1],), activation='relu'))
        distilled_model.add(Dense(3, activation='softmax'))
    
        distilled_model.compile(optimizer=Adam(learning_rate=0.001), loss=KLDivergence(), metrics=['accuracy'])
        distilled_model.fit(X_train, y_train, epochs=50, batch_size=32, validation_data=(X_test, y_test))
        print('distilled model')
    
        return distilled_model
    
    # Train the defensive distillation model
    distilled_model = train_defensive_distillation(X_train, y_train, X_test, y_test)

Epoch 1/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 2s 56ms/step - accuracy: 0.4794 - loss: 13.5945 - val_accuracy: 0.2951 - val_loss: 12.3794
Epoch 2/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step - accuracy: 0.3870 - loss: 12.3830 - val_accuracy: 0.2951 - val_loss: 11.9941
Epoch 3/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step - accuracy: 0.4492 - loss: 11.9999 - val_accuracy: 0.3443 - val_loss: 11.4200
Epoch 4/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step - accuracy: 0.4171 - loss: 12.1850 - val_accuracy: 0.3115 - val_loss: 11.1289
Epoch 5/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step - accuracy: 0.4522 - loss: 10.7819 - val_accuracy: 0.3279 - val_loss: 10.6818
Epoch 6/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step - accuracy: 0.3620 - loss: 10.1795 - val_accuracy: 0.3934 - val_loss: 10.3625
Epoch 7/50
8/8 ━━━━━━━━━━━━━━━━━
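In defensive distillation as it is usually described, the student network is trained on the teacher's class probabilities computed with a softmax at temperature T > 1, rather than on hard labels. A minimal sketch of how temperature softens a teacher's output. The logits and the temperature of 20 are made-up values, and `softmax_with_temperature` is a hypothetical helper, not part of Keras:

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    # Divide logits by the temperature, then apply a numerically stable softmax
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


teacher_logits = [8.0, 2.0, -1.0]  # hypothetical teacher outputs for one sample

hard = softmax_with_temperature(teacher_logits, temperature=1.0)
soft = softmax_with_temperature(teacher_logits, temperature=20.0)

print([round(p, 3) for p in hard])
print([round(p, 3) for p in soft])
```

At temperature 1 almost all probability mass sits on the first class, while at temperature 20 the distribution is much flatter. Those softened vectors are what a student model would be fitted to with a loss such as KL divergence.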

# Security - Enable Differential Privacy

In [None]:
if contents['security']['enableDifferentialPrivacy']==True :
    print("add differential privacy functionality")

    # Alias the import so it does not shadow sklearn's LogisticRegression
    from diffprivlib.models import LogisticRegression as DPLogisticRegression

    # epsilon is the privacy budget; this model takes no delta argument
    model1 = DPLogisticRegression(epsilon=1.0)

    # Train the model
    model1.fit(X_train, y_train)
    print('this model is trained on differentially private data')

    # Make predictions on the test set
    y_pred = model1.predict(X_test)

    # Evaluate the model
    accuracy = accuracy_score(y_test, y_pred)
    conf_matrix = confusion_matrix(y_test, y_pred)
    class_report = classification_report(y_test, y_pred)

    # Print evaluation metrics
    print(f'Accuracy: {accuracy}')
    print('Confusion Matrix:')
    print(conf_matrix)
    print('Classification Report:')
    print(class_report)

add differential privacy functionality
this model is trained on differentially private data
Accuracy: 0.6557377049180327
Confusion Matrix:
[[16 13]
 [ 8 24]]
Classification Report:
              precision    recall  f1-score   support

           0       0.67      0.55      0.60        29
           1       0.65      0.75      0.70        32

    accuracy                           0.66        61
   macro avg       0.66      0.65      0.65        61
weighted avg       0.66      0.66      0.65        61





# Security - Prevent Weight Tampering

In [None]:
if contents['security']['preventWeightTempering']==True :
    #Prevent Weight Tampering & Model Manipulation
    #Digital Signatures: Sign the model file with a digital signature to ensure its integrity.
    # Generate RSA keys for signing
    private_key = rsa.generate_private_key(
        public_exponent=65537,
        key_size=2048
    )
    public_key = private_key.public_key()
    # Save keys to files
    with open('private_key.pem', 'wb') as private_file:
        private_file.write(private_key.private_bytes(
            encoding=serialization.Encoding.PEM,
            format=serialization.PrivateFormat.TraditionalOpenSSL,
            encryption_algorithm=serialization.NoEncryption()
        ))
    with open('public_key.pem', 'wb') as public_file:
        public_file.write(public_key.public_bytes(
            encoding=serialization.Encoding.PEM,
            format=serialization.PublicFormat.SubjectPublicKeyInfo
        ))
    # Sign the model file
    def sign_model(model_path, private_key):
        with open(model_path, 'rb') as model_file:
            model_data = model_file.read()
        signature = private_key.sign(
            model_data,
            padding.PSS(
                mgf=padding.MGF1(hashes.SHA256()),
                salt_length=padding.PSS.MAX_LENGTH
            ),
            hashes.SHA256()
        )
        with open(model_path + '.sig', 'wb') as sig_file:
            sig_file.write(signature)
            print('file signed')
    # Verify the model file
    from cryptography.exceptions import InvalidSignature

    def verify_model(model_path, signature_path, public_key):
        with open(model_path, 'rb') as model_file:
            model_data = model_file.read()
        with open(signature_path, 'rb') as sig_file:
            signature = sig_file.read()
        try:
            # verify() raises InvalidSignature if the file or the signature was altered
            public_key.verify(
                signature,
                model_data,
                padding.PSS(
                    mgf=padding.MGF1(hashes.SHA256()),
                    salt_length=padding.PSS.MAX_LENGTH
                ),
                hashes.SHA256()
            )
            print("Model verification successful")
        except InvalidSignature:
            print("Model verification failed")
    
    # Sign the model
    sign_model('logistic_regression_model.pkl', private_key)

    # Verify the model
    verify_model('logistic_regression_model.pkl', 'logistic_regression_model.pkl.sig', public_key)


file signed
Model verification successful
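A signature check like the one above fails as soon as a single byte of the model file changes. The same idea can be illustrated with only the standard library by recording a SHA-256 digest of the file and comparing it later. This is a swapped-in, simpler technique: unlike the RSA signature, a bare digest does not prove who produced the file. `file_digest` and the temporary file are illustrative assumptions:

```python
import hashlib
import hmac
import os
import tempfile


def file_digest(path, chunk_size=65536):
    # Stream the file so large models do not have to fit in memory
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "model.pkl")  # hypothetical model file
    with open(model_path, "wb") as f:
        f.write(b"original model bytes")

    recorded = file_digest(model_path)
    unchanged_ok = hmac.compare_digest(recorded, file_digest(model_path))

    # Simulate tampering by appending a single byte
    with open(model_path, "ab") as f:
        f.write(b"!")
    tampered_ok = hmac.compare_digest(recorded, file_digest(model_path))

print(unchanged_ok, tampered_ok)
```

The recorded digest would have to be stored somewhere an attacker cannot modify, otherwise both the file and its digest could be replaced together.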


# Security - Encryption

In [None]:


# Encrypting the model file
if contents['security']['encryption']==True :

    def generate_fernet_key(user_key: str) -> bytes:
        # Hash the user key using SHA-256 to get a 32-byte key
        hash_key = hashlib.sha256(user_key.encode()).digest()
        # Base64 encode the hash key to make it a valid Fernet key
        fernet_key = base64.urlsafe_b64encode(hash_key)
        return fernet_key
    
    def encrypt_file(input_file_path: str, output_file_path: str, user_key: str):
        # Generate Fernet key from user key
        fernet_key = generate_fernet_key(user_key)
        # Create a Fernet instance with the key
        fernet = Fernet(fernet_key)
        
        # Read the pickle file as binary
        with open(input_file_path, 'rb') as file:
            data = file.read()
        
        # Encrypt the data
        encrypted_data = fernet.encrypt(data)
        
        # Write the encrypted data to a new file
        with open(output_file_path, 'wb') as file:
            file.write(encrypted_data)
        print(f"File {input_file_path} has been encrypted and saved as {output_file_path}")
    
    def decrypt_file(input_file_path: str, output_file_path: str, user_key: str):
        # Generate Fernet key from user key
        fernet_key = generate_fernet_key(user_key)
        # Create a Fernet instance with the key
        fernet = Fernet(fernet_key)
        
        # Read the encrypted file as binary
        with open(input_file_path, 'rb') as file:
            encrypted_data = file.read()
        
        # Decrypt the data
        decrypted_data = fernet.decrypt(encrypted_data)
        
        # Write the decrypted data back to a pickle file
        with open(output_file_path, 'wb') as file:
            file.write(decrypted_data)
        print(f"File {input_file_path} has been decrypted and saved as {output_file_path}")
    
    # Example usage
    user_key = contents['security']['encryptionKey']
    input_file_path = "model.pkl"
    encrypted_file_path = "encrypted_model.pkl"
    decrypted_file_path = "decrypted_model.pkl"
    
    # Encrypt the pickle file
    encrypt_file(input_file_path, encrypted_file_path, user_key)
    
    # Decrypt the pickle file
    decrypt_file(encrypted_file_path, decrypted_file_path, user_key)

File model.pkl has been encrypted and saved as encrypted_model.pkl
File encrypted_model.pkl has been decrypted and saved as decrypted_model.pkl
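`generate_fernet_key` above derives the key from a single unsalted SHA-256 hash, which makes weak passphrases cheap to brute-force. A common hardening step is a salted, iterated key-derivation function. A minimal sketch using the standard library's PBKDF2: the iteration count is an assumption to tune for your hardware, `derive_fernet_key` is a hypothetical helper, and the salt would need to be stored alongside the ciphertext so the same key can be derived again:

```python
import base64
import hashlib
import os


def derive_fernet_key(passphrase: str, salt: bytes, iterations: int = 600_000) -> bytes:
    # PBKDF2-HMAC-SHA256 stretches the passphrase into 32 bytes of key material
    raw_key = hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt, iterations, dklen=32)
    # Fernet expects the 32-byte key in URL-safe base64 form
    return base64.urlsafe_b64encode(raw_key)


salt = os.urandom(16)  # random per-file salt; not secret, but required for decryption
key = derive_fernet_key("example passphrase", salt)

print(len(key))
```

The returned bytes have the same shape as the output of `generate_fernet_key`, so they could be passed to `Fernet(...)` in the same way.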


# Model Health - Enable Alert

In [None]:

if contents['modelHealth']['enableAlert']==True :
    def send_alert(subject, body):
        sender_email = "ravivarmvattam@gmail.com"
        receiver_email = "ravivarmavattam@gmail.com"
        password = ""
    
        message = MIMEMultipart()
        message["From"] = sender_email
        message["To"] = receiver_email
        message["Subject"] = subject
    
        message.attach(MIMEText(body, "plain"))
    
        # The context manager closes the SMTP connection even if sending fails
        with smtplib.SMTP("smtp.gmail.com", 587) as server:
            server.starttls()
            server.login(sender_email, password)
            server.sendmail(sender_email, receiver_email, message.as_string())

In [None]:
# Check model health and send an alert if accuracy is at or below 80%
model_accuracy = accuracy_score(y_test, y_pred)
print(model_accuracy)
if model_accuracy <= 0.80:
    print('sending alert')
    send_desktop_alert("Model Accuracy Alert", f"The model accuracy has dropped to {model_accuracy:.2f}")

0.6557377049180327
sending alert


# Data Security - Enable Model Backup

In [None]:
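if contents['dataSecurity']['enableModelBackup']==True :
    import dill
    def enableModelBackup():
        path = os.path.join('Downloads', 'modelbackup')
        os.makedirs(path, exist_ok=True)  # create the backup folder if it is missing
        backup_path = os.path.join(path, 'model.pkl')
        # Serialize the trained model to the backup location
        with open(backup_path, 'wb') as f:
            dill.dump(model1, f)
            print('successful')
        # Reload the backup to confirm that it can be restored
        with open(backup_path, 'rb') as f:
            loaded_model = dill.load(f)
            print(loaded_model)
    enableModelBackup()

successful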
LogisticRegression(accountant=BudgetAccountant(spent_budget=[(1.0, 0)]),
                   data_norm=601.1780102432224)


## Zip and unzip a folder

In [None]:
import shutil
import os
import zipfile

def zip_folder(folder_path, destination_path, zip_name="Sales_zip.zip"):
    zip_file_path = os.path.join(destination_path, zip_name)
    # make_archive appends ".zip" itself, so pass the path without the extension
    shutil.make_archive(os.path.splitext(zip_file_path)[0], 'zip', folder_path)
    print(f"Folder '{folder_path}' has been successfully zipped and saved at {zip_file_path}")

def unzip_folder(zip_file_path, destination_folder, new_folder_name="Sales_unzipped"):
    new_folder_path = os.path.join(destination_folder, new_folder_name)
    os.makedirs(new_folder_path, exist_ok=True)
    with zipfile.ZipFile(zip_file_path, 'r') as zip_ref:
        zip_ref.extractall(new_folder_path)

    print(f"Unzipped the file '{zip_file_path}' into the folder '{new_folder_path}'")

folder_path = r"D:\Sales"
destination_path = r"D:\Target"
zip_folder(folder_path, destination_path)

zip_file_path = r"D:\Target\Sales_zip.zip" 
destination_folder = r"D:\Target" 
unzip_folder(zip_file_path, destination_folder)


Folder 'D:\Sales' has been successfully zipped and saved at D:\Target\Sales_zip.zip
Unzipped the file 'D:\Target\Sales_zip.zip' into the folder 'D:\Target\Sales_unzipped'
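The paths above are specific to one Windows machine. A self-contained round trip that can be run anywhere is sketched here using a temporary directory. The folder name, file name, and file contents are made up for illustration:

```python
import os
import shutil
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    # Build a small source folder with one file in it
    source = os.path.join(tmp, "Sales")
    os.makedirs(source)
    with open(os.path.join(source, "report.txt"), "w") as f:
        f.write("quarterly numbers")

    # make_archive appends the ".zip" extension itself and returns the final path
    archive_path = shutil.make_archive(os.path.join(tmp, "Sales_zip"), "zip", source)

    # Extract into a new folder and read the file back
    target = os.path.join(tmp, "Sales_unzipped")
    with zipfile.ZipFile(archive_path, "r") as zip_ref:
        archived_names = zip_ref.namelist()
        zip_ref.extractall(target)

    with open(os.path.join(target, "report.txt")) as f:
        restored_text = f.read()

print(archived_names, restored_text)
```

Reading the restored file back is a cheap way to confirm that the archive actually contains what was zipped before the original folder is hidden or locked.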


## Hide and unhide a folder

In [None]:
import win32api
import win32con

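def hide_folder(folder_path):
    # Add the hidden flag while keeping the folder's other attributes
    attributes = win32api.GetFileAttributes(folder_path)
    win32api.SetFileAttributes(folder_path, attributes | win32con.FILE_ATTRIBUTE_HIDDEN)
    print(f"Folder Hidden '{folder_path}'")

# A raw string keeps the backslash in the Windows path literal
hide_folder(r"D:\Sales")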

Folder Hidden 'D:\Sales'




In [None]:
import win32api
import win32con

def unhide_folder(folder_path):
    attributes = win32api.GetFileAttributes(folder_path)
    if attributes & win32con.FILE_ATTRIBUTE_HIDDEN:
        win32api.SetFileAttributes(folder_path, attributes & ~win32con.FILE_ATTRIBUTE_HIDDEN)
    print(f"Folder Unhidden '{folder_path}'")

unhide_folder(r"D:\Sales")


Folder Unhidden 'D:\Sales'




## Lock and unlock a folder

In [None]:
import subprocess

def lock_folder(folder_path):
    # Deny delete permission to Everyone; check=True raises if icacls fails
    subprocess.run(["icacls", folder_path, "/deny", "Everyone:(D)"], check=True)
    print(f"Folder Locked '{folder_path}'")

lock_folder(r"D:\Sales")




Folder Locked 'D:\Sales'


In [None]:
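import subprocess

def unlock_folder(folder_path):
    # Remove the deny entry for Everyone; check=True raises if icacls fails
    subprocess.run(["icacls", folder_path, "/remove:d", "Everyone"], check=True)
    print(f"Folder Unlocked '{folder_path}'")

unlock_folder(r"D:\Sales")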




Folder Unlocked 'D:\Sales'


## Read CSV with password

In [None]:
pip install pycryptodome



In [None]:
from Crypto.Cipher import AES
from Crypto.Util.Padding import pad
import os

def encrypt_csv(input_csv, output_file, password):
    key = password.ljust(32)[:32].encode()  # Ensure 32-byte key for AES-256
    cipher = AES.new(key, AES.MODE_CBC)
    iv = cipher.iv  # Initialization Vector

    with open(input_csv, "rb") as f:
        plaintext = f.read()

    ciphertext = cipher.encrypt(pad(plaintext, AES.block_size))

    with open(output_file, "wb") as f:
        f.write(iv + ciphertext)  # Save IV + encrypted content

    print(f"CSV file '{input_csv}' encrypted successfully as '{output_file}'")

# Usage (raw strings keep the backslashes in Windows paths literal)
encrypt_csv(r"D:\Sales\TestData.csv", r"D:\Sales\data_encrypted.bin", "mypassword123")

CSV file 'D:\Sales\TestData.csv' encrypted successfully as 'D:\Sales\data_encrypted.bin'




In [None]:
from Crypto.Cipher import AES
from Crypto.Util.Padding import unpad

def decrypt_csv(encrypted_file, output_csv, password):
    key = password.ljust(32)[:32].encode()

    with open(encrypted_file, "rb") as f:
        iv = f.read(16)  # Read IV
        ciphertext = f.read()

    cipher = AES.new(key, AES.MODE_CBC, iv)
    plaintext = unpad(cipher.decrypt(ciphertext), AES.block_size)

    with open(output_csv, "wb") as f:
        f.write(plaintext)

    print(f"Encrypted file '{encrypted_file}' decrypted successfully as '{output_csv}'")

# Usage (raw strings keep the backslashes in Windows paths literal)
decrypt_csv(r"D:\Sales\data_encrypted.bin", r"D:\Sales\data_decrypted.csv", "mypassword123")


Encrypted file 'D:\Sales\data_encrypted.bin' decrypted successfully as 'D:\Sales\data_decrypted.csv'




## Differential Privacy

In [3]:
pip install diffprivlib



In [None]:
import pandas as pd
from diffprivlib.mechanisms import Laplace

file_path = r"D:\Sales\HeartAttackDetection.csv"

df = pd.read_csv(file_path)

print(df.head())

true_mean_age = df['Age'].mean()

epsilon = 0.5  
sensitivity = (df['Age'].max() - df['Age'].min()) / len(df)

laplace_mechanism = Laplace(epsilon=epsilon, sensitivity=sensitivity)
noisy_mean_age = laplace_mechanism.randomise(true_mean_age)

print(f"True Mean of Age: {true_mean_age}")
print(f"Noisy Mean of Age (Differentially Private): {noisy_mean_age}")


   Age  Sex  CP_Type  BloodPressure  Cholestrol  BloodSugar  ECG  \
0   63    1        3            145         233           1    0   
1   37    1        2            130         250           0    1   
2   41    0        1            130         204           0    0   
3   56    1        1            120         236           0    1   
4   57    0        0            120         354           0    1   

   MaxHeartRate  ExerciseAngina  FamilyHistory  Target  
0           150               0              2       1  
1           187               0              1       1  
2           172               0              0       1  
3           178               0              1       1  
4           163               1              0       1  
True Mean of Age: 54.366336633663366
Noisy Mean of Age (Differentially Private): 54.584582014320354
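For a mean over n values clipped to a known range [lower, upper], changing one record moves the mean by at most (upper - lower) / n, and the Laplace mechanism adds noise with scale sensitivity / epsilon. The cell above takes the range from the data itself, which in a strict analysis leaks information. The sketch here instead assumes fixed public bounds and uses only the standard library. The 18 to 100 age range, the seed, and the helper names `laplace_noise` and `private_mean` are assumptions for illustration:

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponential variates with mean `scale`
    # follows a Laplace(0, scale) distribution
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values, lower, upper, epsilon, rng):
    # Clip to the public bounds so the sensitivity bound actually holds
    clipped = [min(max(v, lower), upper) for v in values]
    sensitivity = (upper - lower) / len(clipped)
    true_mean = sum(clipped) / len(clipped)
    return true_mean + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)  # seeded only to make the demo repeatable
ages = [63, 37, 41, 56, 57]  # first rows of the Age column shown above
print(private_mean(ages, lower=18, upper=100, epsilon=0.5, rng=rng))
```

With only five records the noise scale is large relative to the mean. On the full dataset the same epsilon gives a much smaller perturbation because the sensitivity shrinks with n.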




## Column Anomaly

In [27]:
import pandas as pd
import numpy as np
from scipy import stats

file_path = r"D:\Sales\HeartAttackDetection.csv"
# Load data into a DataFrame
df = pd.read_csv(file_path)

# 1. Check for missing values
missing_values = df.isnull().sum()
print(f"Missing Values:\n{missing_values}\n")

# 2. Detect outliers using the Z-score for numerical columns
numerical_columns = ['Age', 'BloodPressure', 'Cholestrol', 'MaxHeartRate']
for col in numerical_columns:
    z_scores = stats.zscore(df[col])
    # np.where returns a tuple; element 0 holds the row positions above the threshold of 2
    outlier_positions = np.where(np.abs(z_scores) > 2)[0]
    if len(outlier_positions) > 0:
        print(f"Outliers in {col}: {df[col].iloc[outlier_positions]}")

# 3. Use IQR to detect outliers for numerical columns
for col in numerical_columns:
    Q1 = df[col].quantile(0.25)
    Q3 = df[col].quantile(0.75)
    IQR = Q3 - Q1
    lower_bound = Q1 - 1.5 * IQR
    upper_bound = Q3 + 1.5 * IQR
    outliers_iqr = df[(df[col] < lower_bound) | (df[col] > upper_bound)]
    if not outliers_iqr.empty:
        print(f"Outliers based on IQR in {col}: \n{outliers_iqr}\n")

# 4. Analyze categorical columns for unusual values
categorical_columns = ['Sex', 'CP_Type', 'BloodSugar', 'ECG', 'ExerciseAngina', 'FamilyHistory', 'Target']

# Expected category codes per column; 'Target' is only listed, not checked.
# CP_Type includes 0 because that value appears in the sample rows of this dataset.
expected_values = {
    'Sex': {0, 1},
    'CP_Type': {0, 1, 2, 3},
    'BloodSugar': {0, 1},
    'ECG': {0, 1},
    'ExerciseAngina': {0, 1},
    'FamilyHistory': {0, 1, 2},
}

for col in categorical_columns:
    unique_values = df[col].unique()
    print(f"Unique values in {col}: {unique_values}")

    # Report any category that falls outside the expected set
    expected = expected_values.get(col)
    if expected is not None and not set(unique_values).issubset(expected):
        print(f"Anomalous values in {col}: {unique_values}")

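# 5. Handling missing values: we can either drop them or fill them
# Example: fill missing 'Age' values with the median. Assigning the result back
# to the column avoids the pandas FutureWarning raised by calling
# fillna(..., inplace=True) on a column selection.
df['Age'] = df['Age'].fillna(df['Age'].median())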

print(f"\nData after filling missing values in 'Age':\n{df}\n")





Missing Values:
Age               0
Sex               0
CP_Type           0
BloodPressure     0
Cholestrol        0
BloodSugar        0
ECG               0
MaxHeartRate      0
ExerciseAngina    0
FamilyHistory     0
Target            0
dtype: int64

Outliers in Age: 58     34
65     35
72     29
125    34
129    74
144    76
157    35
227    35
238    77
239    35
Name: Age, dtype: int64
Outliers in BloodPressure: 8      172
71      94
101    178
110    180
124     94
152    170
195    170
203    180
223    200
228    170
241    174
248    192
260    178
266    180
292    170
Name: BloodPressure, dtype: int64
Outliers in Cholestrol: 4      354
28     417
39     360
53     141
85     564
96     394
111    126
180    353
220    407
246    409
301    131
Name: Cholestrol, dtype: int64
Outliers in MaxHeartRate: 72     202
136     96
198     99
216     97
226    103
233     96
243     88
262     95
269    103
272     71
297     90
Name: MaxHeartRate, dtype: int64
Outliers based on IQR in Bl


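The per-row printouts above get long quickly, so a compact count per column can be easier to scan. The following is a minimal sketch, not part of the original notebook: `summarize_outliers` is a hypothetical helper, the `demo` frame is synthetic data used only for illustration, and the thresholds (|z| > 2 and 1.5 × IQR) mirror the ones used in the cell above.

```python
import pandas as pd


def summarize_outliers(df, columns, z_threshold=2.0, iqr_factor=1.5):
    """Count outliers per column using the Z-score rule and the IQR rule."""
    rows = []
    for col in columns:
        values = df[col].dropna()

        # Z-score rule: distance from the mean in population standard
        # deviations (ddof=0, the same default scipy.stats.zscore uses).
        std = values.std(ddof=0)
        z = (values - values.mean()) / std if std > 0 else values * 0.0

        # IQR rule: values outside [Q1 - k*IQR, Q3 + k*IQR].
        q1, q3 = values.quantile(0.25), values.quantile(0.75)
        iqr = q3 - q1
        lower, upper = q1 - iqr_factor * iqr, q3 + iqr_factor * iqr

        rows.append({
            "column": col,
            "z_outliers": int((z.abs() > z_threshold).sum()),
            "iqr_outliers": int(((values < lower) | (values > upper)).sum()),
        })
    return pd.DataFrame(rows).set_index("column")


# Synthetic data for illustration only (not taken from HeartAttackDetection.csv)
demo = pd.DataFrame({"Age": [50, 52, 49, 51, 48, 53, 50, 95]})
summary = summarize_outliers(demo, ["Age"])
print(summary)
```

On the real dataset the call would presumably be `summarize_outliers(df, numerical_columns)`. The two rules can disagree, because the Z-score depends on the mean and standard deviation, which outliers themselves distort, while the IQR rule relies only on quartiles.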

# Zip and unzip a file or folder with a password

In [22]:
pip install pyzipper


Defaulting to user installation because normal site-packages is not writeable
Collecting pyzipper
  Downloading pyzipper-0.3.6-py2.py3-none-any.whl.metadata (3.5 kB)
Collecting pycryptodomex (from pyzipper)
  Downloading pycryptodomex-3.22.0-cp37-abi3-win_amd64.whl.metadata (3.4 kB)
Downloading pyzipper-0.3.6-py2.py3-none-any.whl (67 kB)
Downloading pycryptodomex-3.22.0-cp37-abi3-win_amd64.whl (1.8 MB)
Installing collected packages: pycryptodomex, pyzipper
Successfully installed pycryptodomex-3.22.0 pyzipper-0.3.6
Note: you may need to restart the kernel to use updated packages.


In [None]:
import os
import pyzipper

def zip_with_password(source_dir, output_zip, password):
    """Create an AES-encrypted zip from a single file or a whole folder."""
    password_bytes = password.encode('utf-8')

    with pyzipper.AESZipFile(output_zip, 'w', encryption=pyzipper.WZ_AES) as zipf:
        zipf.setpassword(password_bytes)

        if os.path.isdir(source_dir):
            # Walk the folder and store each file under its path relative to source_dir
            for root, dirs, files in os.walk(source_dir):
                for file in files:
                    file_path = os.path.join(root, file)
                    arcname = os.path.relpath(file_path, start=source_dir)
                    zipf.write(file_path, arcname=arcname)
        else:
            # Single file: store it under its base name only
            zipf.write(source_dir, os.path.basename(source_dir))

    print(f"Created password-protected zip: {output_zip}")

def unzip_with_password(zip_path, extract_path, password):
    """Extract an AES-encrypted zip archive into extract_path."""
    password_bytes = password.encode('utf-8')

    try:
        # Create the destination folder if it does not exist yet
        os.makedirs(extract_path, exist_ok=True)

        with pyzipper.AESZipFile(zip_path) as zipf:
            zipf.setpassword(password_bytes)
            zipf.extractall(path=extract_path)

        print(f"Successfully extracted zip to: {extract_path}")

    except RuntimeError as e:
        # A wrong password is expected to surface here as a RuntimeError
        print(f"Error extracting zip file: {e}")
        if "password" in str(e):
            print("Incorrect password or file is corrupted.")

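# Example usage. source_dir may be a single file or a folder.
# The password below is a placeholder; do not hard-code real secrets.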
zip_with_password(
    source_dir='D:/Sales/HeartAttackDetection.csv', 
    output_zip='D:/Target/zipwithpassword.zip', 
    password='Password123'
)

unzip_with_password(
    zip_path='D:/Target/zipwithpassword.zip', 
    extract_path='D:/Target', 
    password='Password123'
)


Created password-protected zip: D:/Target/zipwithpassword.zip
Successfully extracted zip to: D:/Target
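A successful extraction message does not by itself show that the extracted file is byte-for-byte identical to the original. One way to check this is to compare SHA-256 digests of the two files. The sketch below is an illustration, not part of the original notebook: `sha256_of_file` and `files_match` are hypothetical helper names, and the demo runs on temporary files rather than on the paths used above.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large files do not need to fit in memory
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_match(original_path, extracted_path):
    """True if both files have the same SHA-256 digest."""
    return sha256_of_file(original_path) == sha256_of_file(extracted_path)


# Demo on temporary files (illustrative content only)
with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "original.csv")
    extracted = os.path.join(tmp, "extracted.csv")
    for path in (original, extracted):
        with open(path, "wb") as fh:
            fh.write(b"Age,Sex\n63,1\n")

    same = files_match(original, extracted)

    # Simulate corruption or tampering by appending to the extracted copy
    with open(extracted, "ab") as fh:
        fh.write(b"tampered\n")
    different = files_match(original, extracted)

print(same, different)
```

For the example above, the comparison would presumably be between `D:/Sales/HeartAttackDetection.csv` and the copy extracted under `D:/Target`.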
