# Web Server Log Analysis - Python Take-Home Assessment

## Overview
This assessment involves analyzing the Calgary HTTP dataset, which contains approximately one year's worth of HTTP requests to the University of Calgary's Computer Science web server. You'll work with real-world web server log data to extract meaningful insights and demonstrate your Python data analysis skills.

## Part 1: Data Loading and Cleaning

### Instructions

* Work in the cells below. You can add as many cells as needed for data loading, cleaning, and exploration.
* Import the required libraries.
* Implement data loading and cleaning: create functions to download, parse, and clean the log data.
* Explore the data: understand its structure and identify any data quality issues.

In [1]:
# You can write your code here for data loading, cleaning, and exploration. Add cells as necessary.


import gzip
import urllib.request
import re
from datetime import datetime
import pandas as pd

# 1) Download & Read
url = "ftp://ita.ee.lbl.gov/traces/calgary_access_log.gz"
local_file = "calgary_access_log.gz"
urllib.request.urlretrieve(url, local_file)

lines = []
with gzip.open(local_file, 'rt', errors='ignore') as f:
    for line in f:
        lines.append(line.strip())
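
Re-running the cell above downloads the archive again every time. A minimal sketch of one way to avoid that, assuming a hypothetical helper named `fetch_if_missing` (not part of the original notebook) that skips the download when the local file already exists:

```python
import os
import urllib.request

def fetch_if_missing(url: str, local_path: str) -> str:
    """Download `url` to `local_path` unless the file already exists.

    Hypothetical helper for illustration; returns the local path either way.
    """
    if not os.path.exists(local_path):
        urllib.request.urlretrieve(url, local_path)
    return local_path

# Example call (left commented out so the sketch does not hit the network):
# local_file = fetch_if_missing(
#     "ftp://ita.ee.lbl.gov/traces/calgary_access_log.gz",
#     "calgary_access_log.gz",
# )
```

The existence check does not verify that a previous download completed, so a partially written file would still be reused.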


## Parse & Clean Entries

In [2]:
import re
from datetime import datetime

# Regex for Apache Common Log Format
LOG_RE = re.compile(
    r'(?P<remotehost>\S+) (\S+) (\S+) \[(?P<dt>[^\]]+)\] '
    r'"(?P<request>[^"]+)" (?P<status>\d{3}) (?P<bytes>\S+)'
)

def parse_line(line: str):
    m = LOG_RE.match(line)
    if not m:
        return None  # malformed
    d = m.groupdict()
    
    # 1) Parse timestamp → datetime
    try:
        d['datetime'] = datetime.strptime(d['dt'], "%d/%b/%Y:%H:%M:%S %z")
    except Exception:
        return None
    
    # 2) Split request into method, resource, protocol
    parts = d['request'].split()
    if len(parts) != 3:
        return None
    d['method'], resource, d['protocol'] = parts
    d['filename'] = resource.split("/")[-1] if "/" in resource else resource
    
    # 3) Convert numeric fields
    d['status'] = int(d['status'])
    d['bytes']  = int(d['bytes']) if d['bytes'].isdigit() else 0
    
    # 4) Extract extension
    d['extension'] = d['filename'].split(".")[-1] if "." in d['filename'] else ""
    
    return d

# Build parsed list, filtering out None
parsed_logs = [rec for rec in (parse_line(l) for l in lines) if rec]
print(f"Valid entries after parsing: {len(parsed_logs)}")


Valid entries after parsing: 722270
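
`parse_line` returns `None` for every rejected line without recording why. The sketch below mirrors its three checks and labels the first one that fails, which makes it possible to tally rejection reasons. The function name `rejection_reason` and the sample lines are hypothetical, written for illustration and not taken from the dataset.

```python
import re
from collections import Counter
from datetime import datetime

# Same pattern as the parsing cell above
LOG_RE = re.compile(
    r'(?P<remotehost>\S+) (\S+) (\S+) \[(?P<dt>[^\]]+)\] '
    r'"(?P<request>[^"]+)" (?P<status>\d{3}) (?P<bytes>\S+)'
)

def rejection_reason(line: str) -> str:
    """Return 'ok', or a short label naming the first check the line fails."""
    m = LOG_RE.match(line)
    if not m:
        return "no_regex_match"
    try:
        datetime.strptime(m.group("dt"), "%d/%b/%Y:%H:%M:%S %z")
    except ValueError:
        return "bad_timestamp"
    if len(m.group("request").split()) != 3:
        return "bad_request_line"
    return "ok"

# Hypothetical sample lines (assumed, not real log entries)
sample_lines = [
    'local - - [24/Oct/1994:13:41:41 -0600] "GET index.html HTTP/1.0" 200 150',
    'local - - [24/Oct/1994:13:41:41 -0600] "GET index.html" 200 150',
    'remote - - [99/Foo/1994:13:41:41 -0600] "GET 1.gif HTTP/1.0" 200 1210',
    'not a log line',
]
reasons = Counter(rejection_reason(line) for line in sample_lines)
print(reasons)
```

Running the same tally over `lines` would show which check accounts for most of the dropped entries.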


In [5]:
# Force-cast to datetime64 (df is created in cell In [4] below; this cell was run after it)
df['datetime'] = pd.to_datetime(df['datetime'], errors='coerce')

# Now you can use .dt
df['date_str'] = df['datetime'].dt.strftime('%d-%b-%Y')
df['hour']     = df['datetime'].dt.hour

print(df.dtypes)
print(df[['datetime','date_str','hour']].head())


remotehost                       object
dt                               object
request                          object
status                            int64
bytes                             int64
datetime      datetime64[ns, UTC-06:00]
method                           object
protocol                         object
filename                         object
extension                        object
date_str                         object
hour                              int32
dtype: object
                   datetime     date_str  hour
0 1994-10-24 13:41:41-06:00  24-Oct-1994    13
1 1994-10-24 13:41:41-06:00  24-Oct-1994    13
2 1994-10-24 13:43:13-06:00  24-Oct-1994    13
3 1994-10-24 13:43:14-06:00  24-Oct-1994    13
4 1994-10-24 13:43:15-06:00  24-Oct-1994    13


## Convert to DataFrame & Final Clean‑up

In [4]:
import pandas as pd

df = pd.DataFrame(parsed_logs)

# Ensure datetime dtype
df['datetime'] = pd.to_datetime(df['datetime'], errors='coerce')

# Flag or drop any remaining NaT
print("NaT count in datetime:", df['datetime'].isna().sum())
df = df.dropna(subset=['datetime'])

# Add helper columns
df['date_str'] = df['datetime'].dt.strftime('%d-%b-%Y')
df['hour']     = df['datetime'].dt.hour

# Final look
print(df.dtypes)
df.head()


NaT count in datetime: 290436
remotehost                       object
dt                               object
request                          object
status                            int64
bytes                             int64
datetime      datetime64[ns, UTC-06:00]
method                           object
protocol                         object
filename                         object
extension                        object
date_str                         object
hour                              int32
dtype: object


| | remotehost | dt | request | status | bytes | datetime | method | protocol | filename | extension | date_str | hour |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | local | 24/Oct/1994:13:41:41 -0600 | GET index.html HTTP/1.0 | 200 | 150 | 1994-10-24 13:41:41-06:00 | GET | HTTP/1.0 | index.html | html | 24-Oct-1994 | 13 |
| 1 | local | 24/Oct/1994:13:41:41 -0600 | GET 1.gif HTTP/1.0 | 200 | 1210 | 1994-10-24 13:41:41-06:00 | GET | HTTP/1.0 | 1.gif | gif | 24-Oct-1994 | 13 |
| 2 | local | 24/Oct/1994:13:43:13 -0600 | GET index.html HTTP/1.0 | 200 | 3185 | 1994-10-24 13:43:13-06:00 | GET | HTTP/1.0 | index.html | html | 24-Oct-1994 | 13 |
| 3 | local | 24/Oct/1994:13:43:14 -0600 | GET 2.gif HTTP/1.0 | 200 | 2555 | 1994-10-24 13:43:14-06:00 | GET | HTTP/1.0 | 2.gif | gif | 24-Oct-1994 | 13 |
| 4 | local | 24/Oct/1994:13:43:15 -0600 | GET 3.gif HTTP/1.0 | 200 | 36403 | 1994-10-24 13:43:15-06:00 | GET | HTTP/1.0 | 3.gif | gif | 24-Oct-1994 | 13 |
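
The NaT count of 290436 is large compared with the 722270 parsed entries, and `dropna` then discards all of those rows. One possible cause (an assumption, not verified here) is that the log carries more than one UTC offset, for example `-0600` and `-0700`, and that `pd.to_datetime(..., errors='coerce')` cannot place mixed offsets in a single timezone-aware column. A minimal sketch that sidesteps the conversion by deriving the helper fields from the Python `datetime` objects before the DataFrame is built; the helper name `add_local_fields` and the sample records are hypothetical:

```python
from datetime import datetime

def add_local_fields(record: dict) -> dict:
    """Return a copy of `record` with date, hour, and offset taken from
    the timestamp as written in the log (server-local wall-clock time)."""
    dt = record["datetime"]
    out = dict(record)
    out["date_str"] = dt.strftime("%d-%b-%Y")
    out["hour"] = dt.hour
    out["utc_offset"] = dt.strftime("%z")
    return out

# Hypothetical records with two different offsets (assumed for illustration)
fmt = "%d/%b/%Y:%H:%M:%S %z"
records = [
    {"datetime": datetime.strptime("24/Oct/1994:13:41:41 -0600", fmt)},
    {"datetime": datetime.strptime("01/Dec/1994:09:05:00 -0700", fmt)},
]
enriched = [add_local_fields(r) for r in records]
for row in enriched:
    print(row["date_str"], row["hour"], row["utc_offset"])
```

Because the fields are computed per record, no row has to be dropped for carrying a different offset. Checking `df['dt'].str[-5:].value_counts()` on the raw timestamp column is one way to confirm whether mixed offsets are actually present.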


## ⚠️ IMPORTANT: Template Questions Section
**DO NOT MODIFY THE TEMPLATE BELOW THIS POINT**

The following section contains the assessment questions. You may add cells above this section for data loading, cleaning, and exploration, but do not modify the function signatures or structure of the questions below.

## Part 2: Analysis Questions

### Instructions

* Implement each function according to its docstring specifications
* Use the cleaned data you prepared in Part 1
* Ensure your functions return the exact data types specified
* Test your functions to verify they work correctly
* You may add helper functions, but keep the main function signatures unchanged

### Q1: Count of total log records

* Description: Count the total number of HTTP requests in the log file. Each line represents one log entry.
* Return Type: `int`
* Example: `123456`

In [6]:
import gzip

def total_log_records(filepath: str = "calgary_access_log.gz") -> int:
    """
    Q1: Count of total log records.

    Objective:
        Determine the total number of HTTP log entries in the dataset.
        Each line in the log file represents one HTTP request.

    Args:
        filepath (str): Path to the .gz log file.

    Returns:
        int: Total number of log entries.
    """
    count = 0
    try:
        with gzip.open(filepath, "rt", errors="ignore") as f:
            for _ in f:
                count += 1
    except FileNotFoundError:
        raise FileNotFoundError(f"Log file not found: {filepath}")
    except Exception as e:
        raise RuntimeError(f"Error reading log file: {e}")
    return count

# Usage
answer1 = total_log_records()
print("Answer 1:")
print(answer1)


Answer 1:
726739


### Q2: Count of unique hosts

* Description: Determine the number of distinct hosts (IP addresses or domain names) that accessed the server.
* Return Type: `int`
* Example: `8567`

In [7]:
import gzip

def unique_host_count(filepath: str = "calgary_access_log.gz") -> int:
    """
    Q2: Count of unique hosts.

    Objective:
        Determine how many distinct hosts accessed the server.

    Args:
        filepath (str): Path to the .gz log file.

    Returns:
        int: Number of unique hosts.
    """
    hosts = set()
    try:
        with gzip.open(filepath, "rt", errors="ignore") as f:
            for line in f:
                parts = line.split()
                if parts:
                    hosts.add(parts[0])
    except FileNotFoundError:
        raise FileNotFoundError(f"Log file not found: {filepath}")
    except Exception as e:
        raise RuntimeError(f"Error reading log file: {e}")
    return len(hosts)

# Usage
answer2 = unique_host_count()
print("Answer 2:")
print(answer2)


Answer 2:
2
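
A result of 2 may look surprising next to the example value. The sample rows in Part 1 show the host field as the literal label `local`, which suggests that hosts in this trace were anonymized into a small set of labels and not recorded as addresses. A quick sanity check is to tally requests per label. The sketch below assumes a hypothetical helper `host_label_counts` and made-up sample lines:

```python
from collections import Counter

def host_label_counts(lines) -> Counter:
    """Count requests per value of the first whitespace-separated field."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if parts:  # skip blank lines
            counts[parts[0]] += 1
    return counts

# Hypothetical sample lines (assumed, not real log entries)
sample_lines = [
    'local - - [24/Oct/1994:13:41:41 -0600] "GET index.html HTTP/1.0" 200 150',
    'remote - - [24/Oct/1994:13:43:13 -0600] "GET 2.gif HTTP/1.0" 200 2555',
    'local - - [24/Oct/1994:13:43:15 -0600] "GET 3.gif HTTP/1.0" 200 36403',
    '',
]
print(host_label_counts(sample_lines).most_common())
```

Passing the opened log file instead of `sample_lines` would show how requests split across the distinct host values.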


### Q3: Date-wise unique filename counts

* Description: For each date, count how many unique filenames were requested.
* Return Type: `dict[str, int]`
* Format: `{ '01-Jul-1995': 123, '02-Jul-1995': 150 }`
* Note: Date format must be 'dd-MMM-yyyy'

In [8]:
import gzip
import re
from datetime import datetime
from collections import defaultdict

def datewise_unique_filename_counts(filepath: str = "calgary_access_log.gz") -> dict[str, int]:
    """
    Q3: Date-wise unique filename counts.

    Objective:
        For each date, count the number of unique filenames that were requested.
        The date should be in 'dd-MMM-yyyy' format (e.g., '01-Jul-1995').

    Args:
        filepath (str): Path to the .gz log file.

    Returns:
        dict: A dictionary mapping each date to its count of unique filenames.
              Example: {'01-Jul-1995': 123, '02-Jul-1995': 150}
    """
    # Regex to pull out the timestamp and request
    log_re = re.compile(
        r'\S+ \S+ \S+ \[(?P<dt>[^\]]+)\] "(?P<req>[^"]+)" \d{3} \S+'
    )
    
    # Map from date_str to set of filenames
    unique_files_per_date: dict[str, set[str]] = defaultdict(set)

    try:
        with gzip.open(filepath, "rt", errors="ignore") as f:
            for line in f:
                m = log_re.match(line)
                if not m:
                    continue
                
                # Parse timestamp
                dt_str = m.group("dt")  # e.g. '24/Oct/1994:14:23:15 -0700'
                try:
                    dt = datetime.strptime(dt_str, "%d/%b/%Y:%H:%M:%S %z")
                except ValueError:
                    continue
                
                date_key = dt.strftime("%d-%b-%Y")  # '24-Oct-1994'
                
                # Extract filename from request
                req_parts = m.group("req").split()
                if len(req_parts) != 3:
                    continue
                resource = req_parts[1]               # e.g. '/index.html'
                filename = resource.split("/")[-1]    # e.g. 'index.html'
                if filename:
                    unique_files_per_date[date_key].add(filename)
    except FileNotFoundError:
        raise FileNotFoundError(f"Log file not found: {filepath}")
    except Exception as e:
        raise RuntimeError(f"Error processing log file: {e}")

    # Convert sets to counts
    return {date: len(files) for date, files in unique_files_per_date.items()}


# Usage
answer3 = datewise_unique_filename_counts()
print("Answer 3:")
print(answer3)


Answer 3:
{'24-Oct-1994': 228, '25-Oct-1994': 319, '26-Oct-1994': 377, '27-Oct-1994': 384, '28-Oct-1994': 399, '29-Oct-1994': 254, '30-Oct-1994': 236, '31-Oct-1994': 361, '01-Nov-1994': 412, '02-Nov-1994': 427, '03-Nov-1994': 459, '04-Nov-1994': 402, '05-Nov-1994': 193, '06-Nov-1994': 219, '07-Nov-1994': 364, '08-Nov-1994': 266, '09-Nov-1994': 335, '10-Nov-1994': 356, '11-Nov-1994': 297, '12-Nov-1994': 173, '13-Nov-1994': 186, '14-Nov-1994': 329, '15-Nov-1994': 324, '16-Nov-1994': 391, '17-Nov-1994': 440, '18-Nov-1994': 403, '19-Nov-1994': 195, '20-Nov-1994': 263, '21-Nov-1994': 335, '22-Nov-1994': 351, '23-Nov-1994': 322, '24-Nov-1994': 365, '25-Nov-1994': 323, '26-Nov-1994': 221, '27-Nov-1994': 187, '28-Nov-1994': 341, '29-Nov-1994': 448, '30-Nov-1994': 354, '01-Dec-1994': 271, '02-Dec-1994': 323, '03-Dec-1994': 189, '04-Dec-1994': 212, '05-Dec-1994': 351, '06-Dec-1994': 297, '07-Dec-1994': 383, '08-Dec-1994': 346, '09-Dec-1994': 372, '10-Dec-1994': 150, '11-Dec-1994': 202, '12-Dec-1

### Q4: Number of 404 response codes

* Description: Count how many HTTP requests resulted in a 404 (Not Found) response.
* Return Type: `int`
* Example: `3490`

In [9]:
import gzip

def count_404_errors(filepath: str = "calgary_access_log.gz") -> int:
    """
    Q4: Number of 404 response codes.

    Objective:
        Count how many times the HTTP 404 Not Found status appears in the logs.

    Args:
        filepath (str): Path to the .gz log file.

    Returns:
        int: Number of 404 errors.
    """
    count = 0
    try:
        with gzip.open(filepath, "rt", errors="ignore") as f:
            for line in f:
                parts = line.split()
                # In Apache Common Log Format, status is the second-to-last field
                if len(parts) >= 2 and parts[-2] == "404":
                    count += 1
    except FileNotFoundError:
        raise FileNotFoundError(f"Log file not found: {filepath}")
    except Exception as e:
        raise RuntimeError(f"Error reading log file: {e}")
    return count

# Usage
answer4 = count_404_errors()
print("Answer 4:")
print(answer4)


Answer 4:
23602


### Q5: Top 15 filenames with 404 responses

* Description: Find the 15 most requested URLs that resulted in a 404 error, sorted by frequency.
* Return Type: `list[tuple[str, int]]`
* Format: `[('missing.html', 200), ('notfound.gif', 123), ...]`

In [10]:
import gzip
import re
from collections import Counter

def top_15_filenames_with_404(filepath: str = "calgary_access_log.gz") -> list[tuple[str, int]]:
    """
    Q5: Top 15 filenames with 404 responses.

    Objective:
        Identify which requested URLs most frequently resulted in a 404 error.
        Return the top 15 filenames sorted by frequency.

    Args:
        filepath (str): Path to the .gz log file.

    Returns:
        list: A list of tuples (filename, count), sorted by count in descending order.
              Example: [('missing.html', 200), ...]
    """
    # Regex for parsing lines
    log_re = re.compile(
        r'\S+ \S+ \S+ \[[^\]]+\] "(?P<req>[^"]+)" (?P<status>\d{3}) \S+'
    )
    
    counter = Counter()
    try:
        with gzip.open(filepath, "rt", errors="ignore") as f:
            for line in f:
                m = log_re.match(line)
                if not m:
                    continue
                status = int(m.group("status"))
                if status != 404:
                    continue
                
                # Extract resource from request
                parts = m.group("req").split()
                if len(parts) != 3:
                    continue
                resource = parts[1]               # e.g. '/path/to/file.html'
                filename = resource.split("/")[-1] or resource
                counter[filename] += 1
    except FileNotFoundError:
        raise FileNotFoundError(f"Log file not found: {filepath}")
    except Exception as e:
        raise RuntimeError(f"Error processing log file: {e}")

    return counter.most_common(15)


# Usage
answer5 = top_15_filenames_with_404()
print("Answer 5:")
print(answer5)


Answer 5:
[('index.html', 4694), ('4115.html', 902), ('1611.html', 649), ('5698.xbm', 585), ('710.txt', 408), ('2002.html', 258), ('2177.gif', 193), ('10695.ps', 161), ('6555.html', 153), ('487.gif', 152), ('151.html', 149), ('40.html', 148), ('488.gif', 148), ('3414.gif', 148), ('9678.gif', 142)]


### Q6: Top 15 file extensions with 404 responses

* Description: Identify the file extensions (like .html, .jpg) that caused the most 404 errors.
* Return Type: `list[tuple[str, int]]`
* Format: `[('html', 345), ('gif', 220), ...]`

In [11]:
import gzip
import re
from collections import Counter

def top_15_ext_with_404(filepath: str = "calgary_access_log.gz") -> list[tuple[str, int]]:
    """
    Q6: Top 15 file extensions with 404 responses.

    Objective:
        Find which file extensions generated the most 404 errors.
        Return the top 15 sorted by number of 404s.

    Args:
        filepath (str): Path to the .gz log file.

    Returns:
        list: A list of tuples (extension, count), sorted by count in descending order.
              Example: [('html', 45), ...]
    """
    # Regex to parse the request line and status
    log_re = re.compile(
        r'\S+ \S+ \S+ \[[^\]]+\] "(?P<req>[^"]+)" (?P<status>\d{3}) \S+'
    )

    ext_counter = Counter()
    try:
        with gzip.open(filepath, "rt", errors="ignore") as f:
            for line in f:
                m = log_re.match(line)
                if not m:
                    continue
                status = int(m.group("status"))
                if status != 404:
                    continue

                # Extract the requested resource path
                parts = m.group("req").split()
                if len(parts) != 3:
                    continue
                resource = parts[1]  # e.g. '/path/to/file.html'
                filename = resource.split("/")[-1]

                # Get extension (text after last dot), or empty string
                if "." in filename:
                    ext = filename.rsplit(".", 1)[1].lower()
                else:
                    ext = ""

                ext_counter[ext] += 1
    except FileNotFoundError:
        raise FileNotFoundError(f"Log file not found: {filepath}")
    except Exception as e:
        raise RuntimeError(f"Error processing log file: {e}")

    # Return the 15 most common extensions
    return ext_counter.most_common(15)


# Usage
answer6 = top_15_ext_with_404()
print("Answer 6:")
print(answer6)


Answer 6:
[('html', 12145), ('gif', 7337), ('xbm', 824), ('ps', 754), ('jpg', 531), ('txt', 508), ('', 337), ('htm', 108), ('cgi', 77), ('com', 45), ('z', 41), ('dvi', 40), ('ca', 36), ('hmtl', 30), ('util', 29)]


### Q7: Total bandwidth transferred per day for the month of July 1995

* Description: Sum the bytes transferred per day for July 1995 (exclude missing or '-' byte values).
* Return Type: `dict[str, int]`

In [12]:
import gzip
import re
from datetime import datetime
from collections import defaultdict

def total_bandwidth_per_day(filepath: str = "calgary_access_log.gz") -> dict[str, int]:
    """
    Q7: Total bandwidth transferred per day for the month of July 1995.

    Objective:
        Sum the number of bytes transferred per day.
        Skip entries where the byte field is missing or '-'.

    Args:
        filepath (str): Path to the .gz log file.

    Returns:
        dict: A dictionary mapping each date ('dd-MMM-yyyy') to total bytes transferred.
              Example: {'01-Jul-1995': 123456789, ...}
    """
    # Regex to parse timestamp and bytes
    log_re = re.compile(
        r'\S+ \S+ \S+ \[(?P<dt>[^\]]+)\] "[^"]+" \d{3} (?P<bytes>\S+)'
    )

    # Accumulator for each date
    bandwidth_per_day = defaultdict(int)

    try:
        with gzip.open(filepath, "rt", errors="ignore") as f:
            for line in f:
                m = log_re.match(line)
                if not m:
                    continue

                # Parse datetime
                dt_str = m.group("dt")  # e.g. '24/Oct/1994:14:23:15 -0700'
                try:
                    dt = datetime.strptime(dt_str, "%d/%b/%Y:%H:%M:%S %z")
                except ValueError:
                    continue

                # Only July 1995
                if dt.year != 1995 or dt.month != 7:
                    continue

                # Parse bytes field
                byte_str = m.group("bytes")
                if byte_str.isdigit():
                    bandwidth_per_day[dt.strftime("%d-%b-%Y")] += int(byte_str)
                # skip '-' or malformed

    except FileNotFoundError:
        raise FileNotFoundError(f"Log file not found: {filepath}")
    except Exception as e:
        raise RuntimeError(f"Error processing log file: {e}")

    return dict(bandwidth_per_day)

# Usage
answer7 = total_bandwidth_per_day()
print("Answer 7:")
print(answer7)


Answer 7:
{'01-Jul-1995': 11349799, '02-Jul-1995': 8656918, '03-Jul-1995': 13596612, '04-Jul-1995': 26573988, '05-Jul-1995': 19541225, '06-Jul-1995': 19755015, '07-Jul-1995': 9427822, '08-Jul-1995': 5403491, '09-Jul-1995': 4660556, '10-Jul-1995': 14917754, '11-Jul-1995': 22507207, '12-Jul-1995': 17367065, '13-Jul-1995': 15989234, '14-Jul-1995': 19186430, '15-Jul-1995': 15773233, '16-Jul-1995': 9016378, '17-Jul-1995': 19601338, '18-Jul-1995': 17099761, '19-Jul-1995': 17851725, '20-Jul-1995': 20752623, '21-Jul-1995': 25491617, '22-Jul-1995': 8136259, '23-Jul-1995': 9593870, '24-Jul-1995': 22308265, '25-Jul-1995': 24561635, '26-Jul-1995': 24995540, '27-Jul-1995': 25969995, '28-Jul-1995': 36460693, '29-Jul-1995': 11700624, '30-Jul-1995': 23189598, '31-Jul-1995': 30730715}


### Q8: Hourly request distribution

* Description: Count how many HTTP requests occurred during each hour (0–23).
* Return Type: `dict[int, int]`
* Format: `{ 0: 1200, 1: 900, ..., 23: 670 }`

In [13]:
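import gzip
import re
from collections import defaultdict
from datetime import datetime

# Common Log Format pattern with named groups for the fields used in Q8 and Q9
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<dt>[^\]]+)\] '
    r'"(?P<req>[^"]+)" (?P<status>\d{3}) (?P<bytes>\S+)'
)

def hourly_request_distribution(filepath: str = "calgary_access_log.gz") -> dict[int, int]:
    """
    Q8: Hourly request distribution.

    Objective:
        Count how many HTTP requests occurred during each hour of the day,
        using the hour as written in the log timestamp.

    Args:
        filepath (str): Path to the .gz log file.

    Returns:
        dict: A dictionary mapping each hour (0-23) to its request count.
    """
    counts = defaultdict(int)
    with gzip.open(filepath, "rt", errors="ignore") as f:
        for line in f:
            m = LOG_RE.match(line)
            if not m:
                continue  # skip lines that do not match the log format
            try:
                dt = datetime.strptime(m.group("dt"), "%d/%b/%Y:%H:%M:%S %z")
            except ValueError:
                continue  # skip unparseable timestamps
            counts[dt.hour] += 1
    return dict(counts)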

# Run and display
answer8 = hourly_request_distribution()
print("Hourly distribution (0–23):", answer8)


Hourly distribution (0–23): {13: 51457, 14: 54562, 15: 50377, 16: 51176, 17: 45060, 18: 33222, 19: 30573, 20: 29691, 21: 27405, 22: 23827, 23: 21883, 0: 18764, 1: 14389, 3: 10901, 4: 9969, 5: 10804, 6: 13059, 7: 16672, 8: 26591, 9: 33987, 10: 43371, 11: 47588, 12: 46814, 2: 12694}
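
The dictionary above is keyed in first-seen order (13, 14, ...), and an hour with no requests would be missing from it entirely. A small sketch, assuming a hypothetical helper named `normalize_hourly_counts`, that returns the keys 0 to 23 in order and fills absent hours with zero:

```python
def normalize_hourly_counts(counts: dict[int, int]) -> dict[int, int]:
    """Return a dict with every hour 0-23 in ascending order.

    Hours missing from `counts` are reported as 0.
    """
    return {hour: counts.get(hour, 0) for hour in range(24)}

# Hypothetical partial input (assumed for illustration)
partial = {13: 5, 2: 1}
normalized = normalize_hourly_counts(partial)
print(list(normalized)[:3], normalized[2], normalized[13], normalized[0])
```

Applying it as `normalize_hourly_counts(answer8)` would give output that matches the `{ 0: ..., 1: ..., ..., 23: ... }` layout in the question's format line.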


### Q9: Top 10 most requested filenames

* Description: Identify the top 10 most frequently requested filenames, regardless of status code.
* Return Type: `list[tuple[str, int]]`
* Format: `[('index.html', 5678), ('home.gif', 4321), ...]`

In [14]:
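import gzip
from collections import Counter

def top_10_most_requested_filenames(filepath: str = "calgary_access_log.gz") -> list[tuple[str, int]]:
    """
    Q9: Top 10 most requested filenames.

    Objective:
        Count every parseable request, regardless of status code.

    Args:
        filepath (str): Path to the .gz log file.

    Returns:
        list: A list of tuples (filename, count), sorted by count in descending order.
    """
    counter = Counter()
    with gzip.open(filepath, "rt", errors="ignore") as f:
        for line in f:
            # LOG_RE is the compiled pattern defined in the Q8 cell above
            m = LOG_RE.match(line)
            if not m:
                continue
            parts = m.group("req").split()
            if len(parts) != 3:
                continue
            resource = parts[1]
            # Keep the last path segment; fall back to the full resource if it ends in '/'
            filename = resource.split("/")[-1] or resource
            counter[filename] += 1
    return counter.most_common(10)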

# Run and display
answer9 = top_10_most_requested_filenames()
print("Top 10 requested filenames:", answer9)


Top 10 requested filenames: [('index.html', 139528), ('3.gif', 24006), ('2.gif', 23595), ('4.gif', 8018), ('244.gif', 5148), ('5.html', 5010), ('4097.gif', 4874), ('8870.jpg', 4492), ('6733.gif', 4278), ('8472.gif', 3843)]


### Q10: HTTP response code distribution

* Description: Count the occurrences of each HTTP response status code (e.g., 200, 404).
* Return Type: `dict[int, int]`
* Format: `{ 200: 150000, 404: 3200, 500: 87 }`

In [15]:
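import gzip
from collections import Counter

def response_code_distribution(filepath: str = "calgary_access_log.gz") -> dict[int, int]:
    """
    Q10: HTTP response code distribution.

    Args:
        filepath (str): Path to the .gz log file.

    Returns:
        dict: A dictionary mapping each status code to its number of occurrences.
    """
    counter = Counter()
    with gzip.open(filepath, "rt", errors="ignore") as f:
        for line in f:
            # Take the second-to-last whitespace-separated token as the status.
            # No format validation is applied here, so malformed lines can
            # contribute numeric tokens that are not real status codes.
            parts = line.rsplit(maxsplit=2)
            if len(parts) < 2:
                continue
            status_str = parts[-2]
            if status_str.isdigit():
                counter[int(status_str)] += 1
    return dict(counter)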

# Run and display
answer10 = response_code_distribution()
print("Status code distribution:", answer10)


Status code distribution: {200: 568502, 302: 30325, 304: 97792, 404: 23602, 780: 1, 403: 4743, 501: 43, 400: 19, 329: 1, 579: 1, 0: 39, 500: 42, 55124: 1, 506: 1, 9154: 1, 1619: 1, 732: 1, 3207: 1, 8192: 3, 530: 1, 2227: 1, 26048: 1, 1268: 4, 1191: 1, 14374: 1, 899: 1, 2060: 1, 2595: 1, 2881: 5, 2711: 1, 2658: 1, 479: 2, 2830: 1, 887: 3, 768: 1, 1791: 1, 1062: 1, 1377: 1, 527: 1, 4627: 1, 1831: 1, 1018: 1, 1294: 1, 1527: 1, 1926: 1, 1363: 1, 688: 1, 2632: 1, 1113: 1, 489: 1, 288: 1, 2587: 1, 1692: 1, 2574: 1, 2420: 1, 2014: 1, 2344: 1, 11079: 1, 755: 1, 1524: 1, 224: 1, 1611: 1, 2250: 1, 3067: 1, 378: 1, 2972: 1, 1819: 1, 968: 1, 554: 1, 2019: 1, 21569: 1, 8046: 1, 252: 2, 5058: 1, 2470: 1, 1179: 1, 999: 1, 1567: 2, 1267: 1, 10746: 2, 401: 46, 2544: 1, 4878: 1, 10853: 1, 839: 1, 2247: 1, 10814: 2, 7708: 1, 10928: 1, 1682: 1, 19696: 1, 718: 1, 5409: 1, 12053: 1, 2786: 1, 7259: 2, 4308: 1, 1629: 2, 1729: 2, 3818: 1, 1046: 1, 3020: 1, 1596: 1, 322: 1}
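
Keys such as 55124, 9154, or 0 in the output above are not valid HTTP status codes. They most likely come from malformed lines where the second-to-last token happens to be numeric. A stricter sketch, under the assumption that a valid entry ends with the closing quote of the request, a three-digit status in the 1xx to 5xx range, and a byte count or `-`. The names `STATUS_RE` and `strict_status_counts` and the sample lines are hypothetical:

```python
import re
from collections import Counter

# Status must follow the request's closing quote and be followed by bytes or '-'
STATUS_RE = re.compile(r'" (?P<status>[1-5]\d{2}) (?:\d+|-)\s*$')

def strict_status_counts(lines) -> dict[int, int]:
    """Count status codes only on lines whose tail matches STATUS_RE."""
    counter = Counter()
    for line in lines:
        m = STATUS_RE.search(line)
        if m:
            counter[int(m.group("status"))] += 1
    return dict(counter)

# Hypothetical sample lines (assumed, not real log entries)
sample_lines = [
    'local - - [24/Oct/1994:13:41:41 -0600] "GET index.html HTTP/1.0" 200 150\n',
    'remote - - [24/Oct/1994:13:43:13 -0600] "GET 9.gif HTTP/1.0" 404 -\n',
    'remote - - [24/Oct/1994:13:43:13 -0600] "GET a b c 55124 77\n',
]
print(strict_status_counts(sample_lines))
```

The third sample line has no closing quote before its numeric tail, so it is ignored by the strict version, whereas the unvalidated approach would count 55124 as a status. Running this over the real file would give different totals from the output recorded above, since malformed lines are excluded.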
