# No Man's Sky Updates Analysis

## Table of Contents
1. [Introduction](#Introduction)
2. [Data Description](#Data-Description)

## Introduction
The purpose of this analysis is to:
- Examine the frequency and patterns of updates for No Man's Sky.
- Provide insights into the development cycle.
- Help predict future updates.
- Identify periods of high and low activity for planning game events or releases.

## Data Description
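The dataset was collected from official No Man's Sky update logs. Each row includes:
- `Category`: the entry type (for example `patch` or `update`).
- `Title`: the title of the update, which usually contains the version number.
- `Date`: the timestamp of the update.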

In [2]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime, timedelta
import numpy as np

# Load the update data for No Man's Sky
noman_update_data = pd.read_csv('../data/Noman/noman_updates_collection.csv')

# Display the first few rows of the data
print(noman_update_data.head())

  Category                                              Title  \
0    patch                 No Man's Sky Companions Patch 3.21   
1    patch  No Man's Sky Update 3.2 patch notes bring adop...   
2    patch                               Crossplay Patch 2.52   
3    patch                                Exo Mech Patch 2.42   
4    patch  No Man's Sky Beyond patch notes: VR, Multiplay...   

                  Date  
0  2021-02-19 14:45:50  
1  2021-02-17 07:40:00  
2  2020-06-12 16:25:20  
3  2020-04-16 13:50:00  
4  2019-08-15 15:57:34  


In [14]:
import re

# Convert the 'Date' column to datetime
noman_update_data['Date'] = pd.to_datetime(noman_update_data['Date'])

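# Define a function to extract and normalize version numbers
def extract_version(title):
    """Return the first version found in `title` as 'major.minor.patch', or None."""
    match = re.search(r'(\d+)\.(\d+)', title)
    if match:
        major, rest = match.groups()
        # The first digit after the dot is the minor version and the remaining
        # digits are the patch ('3.21' -> '3.2.1'). Default the patch to '0' so
        # that '3.2' becomes '3.2.0' instead of the malformed '3.2.'.
        return f"{major}.{rest[0]}.{rest[1:] or '0'}"
    return None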

# Apply the function to extract and format versions
noman_update_data['Version'] = noman_update_data['Title'].apply(extract_version)

# Drop rows with no version information
noman_update_data = noman_update_data.dropna(subset=['Version']).copy()

# Extract numeric version parts. Note that 'Major_Version' holds the release
# line as a float ('major.minor', e.g. 1.3), not the major number alone.
noman_update_data['Major_Version'] = noman_update_data['Version'].str.extract(r'(\d+\.\d+)').astype(float)
noman_update_data['Minor_Version'] = noman_update_data['Version'].str.extract(r'\d+\.(\d+)').astype(float)
noman_update_data['Patch_Version'] = noman_update_data['Version'].str.extract(r'\d+\.\d+\.(\d+)').astype(float)

# Create a combined version number. 'Major_Version' already contains the minor
# digit, so only the patch component is added here; also adding
# Minor_Version / 10 would count the minor digit twice. A missing patch counts as 0.
noman_update_data['Combined_Version'] = noman_update_data['Major_Version'] + noman_update_data['Patch_Version'].fillna(0) / 1000

# Sort the dataframe by date
noman_update_data = noman_update_data.sort_values('Date')

# Reset the index
noman_update_data = noman_update_data.reset_index(drop=True)

print(noman_update_data.head(100))

   Category                                              Title  \
0     patch                                   Patch Notes 1.07   
1     patch                                   Patch Notes 1.07   
2     patch                                         Patch 1.12   
3     patch                                         Patch 1.12   
4     patch                                         Patch 1.13   
5     patch                                         Patch 1.13   
6     patch                           Path Finder - Patch 1.22   
7     patch                           Path Finder - Patch 1.22   
8     patch                           Path Finder - Patch 1.23   
9     patch                           Path Finder - Patch 1.23   
10    patch                             Path Finder Patch 1.24   
11    patch                             Path Finder Patch 1.24   
12   update  No Man's Sky teases portals and spacecrafts ah...   
13   update              Update 1.3: Atlas Rises - Coming Soon   
14   updat
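
The sorted output above lists several titles twice (for example `Patch Notes 1.07` and `Patch 1.12`), which suggests the CSV contains duplicate rows. A minimal sketch of how such rows could be counted and removed follows. The `sample` frame is a hypothetical stand-in for `noman_update_data`, its dates are illustrative, and treating a repeated `Title` and `Date` pair as a duplicate is an assumption rather than something the source data confirms.

```python
import pandas as pd

# Hypothetical sample that mimics the repeated rows seen in the output above.
# The dates are illustrative and are not read from the real CSV.
sample = pd.DataFrame({
    "Category": ["patch", "patch", "patch", "patch"],
    "Title": ["Patch Notes 1.07", "Patch Notes 1.07", "Patch 1.12", "Patch 1.12"],
    "Date": pd.to_datetime([
        "2016-09-02 17:25:14", "2016-09-02 17:25:14",
        "2016-12-07 10:54:35", "2016-12-07 10:54:35",
    ]),
})

# Count rows that repeat an earlier (Title, Date) pair.
n_duplicates = int(sample.duplicated(subset=["Title", "Date"]).sum())

# Keep only the first occurrence of each (Title, Date) pair.
deduplicated = sample.drop_duplicates(subset=["Title", "Date"]).reset_index(drop=True)

print(f"Duplicate rows: {n_duplicates}")
print(deduplicated)
```

If the duplicates are genuine, removing them before grouping would keep per-version counts from being inflated.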

In [13]:
# Group by major version (first two parts of the version) and aggregate data
noman_grouped_df = noman_update_data.groupby('Major_Version').agg({
    'Date': ['min', 'max'],
    'Version': 'count'
}).reset_index()

# Rename columns
noman_grouped_df.columns = ['Major_Version', 'Start_Date', 'End_Date', 'Num_Versions']

print(noman_grouped_df)

    Major_Version          Start_Date            End_Date  Num_Versions
0             1.0 2016-09-02 17:25:14 2016-09-02 17:25:14             2
1             1.1 2016-12-07 10:54:35 2016-12-12 12:00:10             4
2             1.2 2017-03-13 16:53:02 2017-03-27 11:58:53             6
3             1.3 2017-08-08 15:00:13 2017-10-03 09:54:50            19
4             1.5 2018-07-26 18:47:35 2018-09-07 18:02:48             6
5             1.6 2018-09-20 14:13:31 2018-09-27 16:04:24             2
6             2.4 2020-04-16 13:50:00 2020-04-16 13:50:00             1
7             2.5 2020-06-12 16:25:20 2020-06-12 16:25:20             1
8             3.2 2021-02-17 07:40:00 2021-02-19 14:45:50             2
9             4.0 2022-10-03 13:00:10 2022-10-07 15:00:20             6
10            4.1 2023-02-22 17:02:40 2023-02-22 17:02:40             1
11            5.0 2024-07-17 13:00:00 2024-07-17 13:22:39             2
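
Since the stated goal is to find periods of high and low activity, one possible next step is to measure how long each release line stayed active and how much time passed between release lines. The sketch below retypes the first four rows of the grouped output by hand so that it runs on its own. The column names `Active_Days` and `Days_Since_Previous` are hypothetical and do not appear in the original notebook.

```python
import pandas as pd

# First four rows of the grouped output above, retyped by hand for illustration.
releases = pd.DataFrame({
    "Major_Version": [1.0, 1.1, 1.2, 1.3],
    "Start_Date": pd.to_datetime([
        "2016-09-02 17:25:14", "2016-12-07 10:54:35",
        "2017-03-13 16:53:02", "2017-08-08 15:00:13",
    ]),
    "End_Date": pd.to_datetime([
        "2016-09-02 17:25:14", "2016-12-12 12:00:10",
        "2017-03-27 11:58:53", "2017-10-03 09:54:50",
    ]),
})

# Whole days between the first and last entry within each release line.
releases["Active_Days"] = (releases["End_Date"] - releases["Start_Date"]).dt.days

# Whole days from the previous release line's first entry to this one's
# first entry. The first row has no predecessor, so its value is NaN.
releases["Days_Since_Previous"] = releases["Start_Date"].diff().dt.days

print(releases[["Major_Version", "Active_Days", "Days_Since_Previous"]])
```

On the full table the same two columns would make long quiet stretches, such as the gap between the 1.6 and 2.4 rows, easy to spot. That gap may also reflect updates missing from the dataset rather than a real pause in development.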
