

# 🛡️ **Week 4, Day 4: More on Python for Cybersecurity**



# 🔐 Pwned Password Checker: Enhancing Password Security

### 🕵️‍♂️ **What Does "Pwned" Mean?**
**"Pwned"** (pronounced like "owned") is a slang term used in gaming and cybersecurity. It means that someone or something has been **defeated, compromised, or taken control of**. The term comes from a common typing error of the word "owned." In cybersecurity, if your account or password is **"pwned,"** it means it has been exposed in a data breach.

### 🛡️ **Overview of the Exercise**
In this activity, we'll use the **Have I Been Pwned (HIBP)** API to check whether passwords have appeared in known data breaches. You’ll learn how to make API requests and why secure passwords matter. This exercise will help you identify weak or compromised passwords and improve your cybersecurity awareness.

- **Have I Been Pwned (HIBP)**: [HIBP API documentation](https://haveibeenpwned.com/API/v3)
- **Get an API Key** (optional and paid; the Pwned Passwords endpoint used in this exercise does not require one): [Get API Key](https://haveibeenpwned.com/API/Key)

### 📚 **Introduction to Hashing**

1. **What Is a Hash?**
   - A **hash** is a fixed-length code computed from a piece of information (like a password). It looks like a random string of letters and numbers, but the same input always produces the same hash. Hashing lets a system check passwords without storing the actual password.
   - Example: If your password is `"password123"`, its SHA-1 hash would look something like:
     ```
     CBFDAC6008F9CAB4083784CBD1874F76618D2A97
     ```

2. **Why Use Only the First 5 Characters of the Hash?**
   - Instead of sending the entire hash to the API, we only send the **first 5 characters** (called the **prefix**). This method helps protect your password because the full hash is never shared.
   - The remaining characters (called the **suffix**) are kept private and checked locally on your computer. This approach is known as **k-anonymity** and helps ensure your data privacy.

3. **How Does the Password Check Work?**
   - We split the hash into two parts:
     - **Prefix**: The first 5 characters, which are sent to the API.
     - **Suffix**: The remaining characters, kept locally for checking.
   - The API responds with a list of possible hash matches that start with the prefix.
   - We then compare the suffix locally to see if the full hash matches any in the list. This way, we can determine if a password is compromised without revealing it.
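
The prefix/suffix split and the local comparison described above can be sketched without any network access. This is a minimal illustration, not the HIBP client itself: the helper names `split_sha1` and `suffix_in_response` are hypothetical, and `sample_response` is a made-up response body that only imitates the `SUFFIX:COUNT` line format.

```python
import hashlib

def split_sha1(password):
    """Return the (prefix, suffix) of the uppercase SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]

def suffix_in_response(response_text, suffix):
    """Check whether `suffix` appears in a range-style response body."""
    for line in response_text.splitlines():
        candidate, _, _count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return True
    return False

prefix, suffix = split_sha1("password123")

# Made-up response body for illustration only. A real response lists
# every known suffix that shares the prefix, one "SUFFIX:COUNT" per line.
sample_response = f"{'0' * 35}:3\r\n{suffix}:42\r\n{'F' * 35}:1"

print(f"Sent to the API: {prefix} ({len(prefix)} characters)")
print(f"Kept locally:    {len(suffix)} characters")
print(f"Match found:     {suffix_in_response(sample_response, suffix)}")
```

Only `prefix` would ever leave your computer; the comparison against `suffix` happens entirely in `suffix_in_response`.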


In [None]:
from pprint import pprint

passwords = [
    {
        'last_name': 'Williams',
        'first_name': 'Dartanion',
        'password': 'test123',
    },
    {
        'last_name': 'Swift',
        'first_name': 'Deangelo',
        'password': 'Pl3@$3-Pr0t3ct-My_Cr3d$!!',
    },
    {
        'last_name': 'Swift',
        'first_name': 'Dartdart',
        'password': 'Chicago1!',
    },
    {
        'last_name': 'Swift',
        'first_name': 'Dartie',
        'password': 'Password1',
    },
    {
        'last_name': 'TheGreat',
        'first_name': 'Dartanion',
        'password': 'Th3Qu!cKBr0wnF0x#711!',
    },
]

pprint(passwords)

In [None]:
import pandas as pd
import hashlib
import requests

accounts_df = pd.DataFrame(passwords)

# Define a function to check if a password is compromised
def check_password(password):
    # Hash the password using SHA-1
    sha1_hash = hashlib.sha1(password.encode()).hexdigest().upper()

    # Split the hash into the first 5 characters (prefix) and the rest (suffix)
    hash_prefix = sha1_hash[:5]
    hash_suffix = sha1_hash[5:]

    # Query the HIBP API with the hash prefix only
    url = f"https://api.pwnedpasswords.com/range/{hash_prefix}"
    response = requests.get(url, timeout=10)  # Don't hang forever on a slow connection

    # Check if the request was successful
    if response.status_code != 200:
        print(f"Error: Unable to connect to HIBP API (Status Code: {response.status_code})")
        return False  # Note: a failed lookup is reported the same as "not compromised"

    # Parse the response and check for the hash suffix
    hashes = (line.split(':') for line in response.text.splitlines())
    for suffix, count in hashes:
        if suffix == hash_suffix:
            return True  # Password is compromised

    return False  # Password is not compromised

# Create a new column to store the results
accounts_df['compromised'] = accounts_df['password'].apply(check_password)

# Filter the DataFrame for compromised passwords
compromised_users = accounts_df[accounts_df['compromised']]
uncompromised_users = accounts_df[~accounts_df['compromised']]


# Display the compromised and uncompromised users
display('Compromised users', compromised_users)
display('Uncompromised users', uncompromised_users)

### Practice activity 1
In this activity, you will build a simple Python script to check if a password has been compromised using the **Have I Been Pwned (HIBP) Pwned Passwords API**. This will help you understand how to use API requests and learn about basic cybersecurity practices.

### 📝 **Instructions**:

1. **Understand the Scenario**:
   - You work on the cybersecurity team, and your task is to help users ensure their passwords are safe.
   - You’ll create a script that takes a password as input, checks if it has been compromised in a known data breach, and alerts the user if they should change it.

2. **Exercise steps**:

   - **Step 1: Import Libraries**  
     Start by importing the necessary Python libraries.

     ```python
     import hashlib
     import requests
     ```

   - **Step 2: Get User Input**  
     Prompt the user to enter a password to check if it’s been compromised.

     ```python
     password = input("Enter a password to check if it has been compromised: ")
     ```

   - **Step 3: Hash the Password Using SHA-1**  
     Use the SHA-1 hashing algorithm to generate a hash of the password. The API works with SHA-1 hashes, so the plain-text password never leaves your computer.

     ```python
     sha1_hash = hashlib.sha1(password.encode()).hexdigest().upper()
     ```

   - **Step 4: Prepare the API Request**  
     Split the hash into a **prefix** (first 5 characters) and a **suffix** (remaining characters). This allows you to use the k-anonymity model to protect the full password hash.

     ```python
     hash_prefix = sha1_hash[:5]
     hash_suffix = sha1_hash[5:]
     ```

   - **Step 5: Make the API Request**  
     Send the hash prefix to the HIBP API, which returns every breached hash suffix that shares that prefix.

     ```python
     url = f"https://api.pwnedpasswords.com/range/{hash_prefix}"
     response = requests.get(url)
     ```

   - **Step 6: Check the API Response**  
     Ensure the request was successful. If not, print an error message.

     ```python
     if response.status_code != 200:
         print(f"Error: Unable to connect to HIBP API (Status Code: {response.status_code})")
     ```

   - **Step 7: Parse the Response**  
     The API returns a list of hash suffixes and the number of times each was found. Check if the hash suffix of your password is in the returned list.

     ```python
     hashes = (line.split(':') for line in response.text.splitlines())
     found = any(suffix == hash_suffix for suffix, count in hashes)
     ```

   - **Step 8: Display the Result**  
     If the hash suffix is found, alert the user that their password has been compromised. Otherwise, inform them that it was not found in the breached list (this does not by itself prove the password is strong).

     ```python
     if found:
         print("⚠️ This password has been compromised! Choose a different one.")
     else:
         print("✅ This password is safe and not found in the compromised list.")
     ```

3. **Testing Your Script**:
   - Try running the script with sample passwords like `"password123"`, `"qwerty2023"`, or `"securepass1"`.
   - Observe the output and determine if the password was found in the breached dataset.
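
Each line of the API response also carries the number of times that hash was seen, which Step 7 discards. As a hedged sketch, the hypothetical helper `breach_count` below returns that number instead of a plain True/False. The `sample_text` value is invented for illustration and is not real API output.

```python
def breach_count(response_text, hash_suffix):
    """Return how many times `hash_suffix` is listed, or 0 if it is absent."""
    for line in response_text.splitlines():
        suffix, sep, count = line.partition(":")
        # Skip malformed lines that have no ":" separator
        if sep and suffix.strip().upper() == hash_suffix.upper():
            return int(count.strip())
    return 0

# Invented response body in the "SUFFIX:COUNT" format, for illustration only
sample_text = f"{'A' * 35}:12\r\n{'B' * 35}:7\r\n"

print(breach_count(sample_text, "A" * 35))
print(breach_count(sample_text, "C" * 35))
```

A count lets a script report how widely a password has circulated, rather than only whether it appeared at all.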

In [None]:
import hashlib
import requests

# Prompt the user for a password to check
password = # YOUR CODE HERE

# Hash the password using SHA-1
sha1_hash = hashlib.sha1(password.encode()).hexdigest().upper()

# Get the first 5 characters of the hash (prefix)
hash_prefix = sha1_hash[:5]
hash_suffix = sha1_hash[5:]

# Query the HIBP API with the hash prefix
url = f"https://api.pwnedpasswords.com/range/{hash_prefix}"
response = requests.get(url)

# Check if the request was successful
if response.status_code == 200:
    # Parse the response text (hash suffixes and counts)
    hashes = # YOUR CODE HERE

    # Check if the hash suffix matches any returned from the API
    found = # YOUR CODE HERE

    # Conditional statement to check if the password was found or not
else:
    print(f"Error: Unable to connect to HIBP API (Status Code: {response.status_code})")

# 📢 Google Safe Browsing API

The Google Safe Browsing API can be used to check URLs for potential threats like phishing, malware, and unwanted software. This API helps enhance cybersecurity by identifying unsafe websites and protecting users from online threats.

### 🔍 What Is the Google Safe Browsing API?
The Google Safe Browsing API is a service provided by Google that checks URLs against a constantly updated list of unsafe websites. It’s widely used by web browsers, security software, and apps to warn users before they visit a harmful site.

### 🗝️ Getting Started with the API Key:
To use the Google Safe Browsing API, you need an API key. Follow these steps to get your key:
1. Go to the [Google Cloud Console](https://console.cloud.google.com/).
2. Create a new project (or use an existing one).
3. Enable the **Google Safe Browsing API** from the API Library.
4. Go to **APIs & Services > Credentials** and click on **Create API Key**.
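5. Copy the generated API key and store it as a Colab secret named `SAFE_BROWSING_API_KEY` (open the Secrets panel from the key icon in the left sidebar and allow notebook access).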
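
If you run this notebook outside Colab, the `google.colab` module is not available. One possible workaround, sketched here under the assumption that you have exported the key as an environment variable with the same name, is a small loader that falls back to `os.environ`. The function name `load_api_key` is hypothetical.

```python
import os

def load_api_key(name="SAFE_BROWSING_API_KEY"):
    """Read the key from Colab secrets if available, else from the environment."""
    try:
        from google.colab import userdata  # Only importable inside Colab
    except ImportError:
        # Assumption: outside Colab the key lives in an environment variable
        return os.environ.get(name)
    return userdata.get(name)
```

Either way, the key stays out of the notebook source, so it is not exposed when the notebook is shared.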

In [None]:
from google.colab import userdata
SAFE_BROWSING_API_KEY = userdata.get('SAFE_BROWSING_API_KEY')

In [None]:
import requests

# Define the API endpoint
api_url = "https://safebrowsing.googleapis.com/v4/threatMatches:find"

# Sample URL list
urls_to_check = [
    "http://malware.wicar.org/data/ms14_064_ole_not_xp.html",
    "https://www.google.com",
    "http://safesite.com",
    "http://phishingsite.com/login",
    "http://testsafebrowsing.appspot.com/s/malware.html",
    "http://testsafebrowsing.appspot.com/s/phishing.html"
]

# Function to check a URL using Google Safe Browsing API
def check_url(api_key, url_to_check):
    # Prepare the request payload
    payload = {
        "client": {
            "clientId": "test-client",
            "clientVersion": "1.0"
        },
        "threatInfo": {
            "threatTypes": ["MALWARE", "SOCIAL_ENGINEERING", "UNWANTED_SOFTWARE"],
            "platformTypes": ["ANY_PLATFORM"],
            "threatEntryTypes": ["URL"],
            "threatEntries": [{"url": url_to_check}]
        }
    }

    # Make the POST request, passing the API key as a query parameter
    try:
        response = requests.post(api_url, params={"key": api_key}, json=payload, timeout=10)

        # Debugging: Print response status code and content
        print(f"Status Code: {response.status_code}")
        print(f"Response Content: {response.text}")

        # Check if the request was successful
        if response.status_code == 200:
            # Parse the JSON response; an empty result means no threats matched
            result = response.json()
            if "matches" in result:
                print(f"⚠️ Warning: The URL '{url_to_check}' is flagged as potentially unsafe.")
            else:
                print(f"✅ No threats found for the URL '{url_to_check}'.")
        else:
            print(f"Error: The API request failed (Status Code: {response.status_code})")
    except requests.RequestException as e:
        # Covers connection errors, timeouts, and invalid JSON responses
        print(f"An error occurred: {e}")

# Check each URL in the list
for url in urls_to_check:
    check_url(SAFE_BROWSING_API_KEY, url)
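
The loop above sends one request per URL, but `threatEntries` is a list, so several URLs can share one request. The sketch below shows one way to build such a payload and to read back which URLs were flagged. The helper names `build_payload` and `flagged_urls` are hypothetical, and `sample_result` is a hand-written response shaped like my understanding of the v4 `matches` format, not real API output.

```python
def build_payload(urls, client_id="test-client", client_version="1.0"):
    """Build one request body that covers every URL in `urls`."""
    return {
        "client": {"clientId": client_id, "clientVersion": client_version},
        "threatInfo": {
            "threatTypes": ["MALWARE", "SOCIAL_ENGINEERING", "UNWANTED_SOFTWARE"],
            "platformTypes": ["ANY_PLATFORM"],
            "threatEntryTypes": ["URL"],
            "threatEntries": [{"url": u} for u in urls],
        },
    }

def flagged_urls(result):
    """Map each flagged URL to the set of threat types reported for it."""
    flagged = {}
    for match in result.get("matches", []):
        url = match.get("threat", {}).get("url")
        if url is not None:
            flagged.setdefault(url, set()).add(match.get("threatType"))
    return flagged

# Hand-written example response, for illustration only
sample_result = {
    "matches": [
        {
            "threatType": "MALWARE",
            "platformType": "ANY_PLATFORM",
            "threatEntryType": "URL",
            "threat": {"url": "http://testsafebrowsing.appspot.com/s/malware.html"},
        }
    ]
}

payload = build_payload(["https://www.google.com",
                         "http://testsafebrowsing.appspot.com/s/malware.html"])
print(len(payload["threatInfo"]["threatEntries"]))
print(flagged_urls(sample_result))
```

Any URL from the batch that is missing from the returned mapping had no match in the response.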