## Phase I Project Proposal
### What Makes Virtual Reality Popular?

#### Name: Miles Gallagher, DS 3000

### Introduction
Virtual reality (VR) is rapidly changing gaming by offering highly immersive experiences that let users interact directly with digital environments. Although VR is growing in popularity, developers need a clear understanding of what drives engagement in order to create more compelling games. For this project, I want to identify the key factors that affect user engagement in VR games by answering these questions:
1. What are the most popular VR games on Steam?
2. How do factors such as price, genre, and reviews influence the success of VR games?
https://www.grandviewresearch.com/industry-analysis/virtual-reality-vr-headset-market
https://www.cnet.com/pictures/best-vr-games/


### Data Collection

The data for this project is collected directly from Steam’s website using web scraping. Steam is the largest digital distribution platform for PC gaming. I use requests to download the search results page and BeautifulSoup to parse it in Python, including the review information stored in each result's data-tooltip-html attribute. So far I have created a function that gathers some of the top-rated VR games. Later, I will create another function that goes into more depth on number of reviews, price, and genre.
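
To illustrate the data-tooltip-html step, here is a minimal sketch of pulling the rating label, positive percentage, and review count out of one tooltip string. The tooltip layout in the comment is an assumption about how Steam currently formats it, the sample string is written by hand, and `parse_review_tooltip` is a hypothetical helper name rather than part of the project code.

```python
import re

# Assumed tooltip layout (not guaranteed by Steam):
#   "<label><br><pct>% of the <count> user reviews ... are positive."
TOOLTIP_PATTERN = re.compile(r"(\d+)% of the ([\d,]+) user reviews")


def parse_review_tooltip(tooltip):
    """Split a review tooltip into (label, percent_positive, review_count).

    percent_positive and review_count are None when the text does not
    match the assumed layout.
    """
    label = tooltip.split("<br>")[0].strip()
    match = TOOLTIP_PATTERN.search(tooltip)
    if match is None:
        return label, None, None
    percent = int(match.group(1))
    count = int(match.group(2).replace(",", ""))
    return label, percent, count


# Hypothetical tooltip string, written by hand for illustration.
sample = "Very Positive<br>90% of the 1,234 user reviews for this game are positive."
print(parse_review_tooltip(sample))  # ('Very Positive', 90, 1234)
```

A regular expression keeps the percentage as its own value, which chained `split` calls would discard, and it fails safely when the layout changes.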

import requests
from bs4 import BeautifulSoup
import pandas as pd

def fetch_popular_vr_games(limit=5):
    """
    Fetches popular VR games from Steam search results, including their app IDs, prices, number of reviews,
    overall review ratings (e.g. "Very Positive"), and supported operating systems (Windows, macOS, Linux).

    Args:
        limit (int): Number of games to retrieve.

    Returns:
        pd.DataFrame: DataFrame containing popular VR games and information.
    """
    url = "https://store.steampowered.com/search/?sort_by=_ASC&ignore_preferences=1&tags=21978&category1=998&vrsupport=101&os=win&supportedlang=english"
    
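    response = requests.get(url, timeout=10)
    response.raise_for_status()  # Stop early on an HTTP error instead of parsing an error page.
    soup = BeautifulSoup(response.text, "html.parser")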
    
    games = []
    # Find game rows, keeping only the first `limit` results.
    game_rows = soup.find_all("a", {"class": "search_result_row"})[:limit]
    for game in game_rows:
        game_title = game.find("span", class_="title").text.strip()
        app_id = game["data-ds-appid"]
        price_tag = game.find("div", {"class": "discount_final_price"})
        price = price_tag.text.strip() if price_tag else "Free"
        
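        # Extract the review data. Default to None so games without reviews
        # do not raise a NameError or reuse the previous game's values.
        positive_percentage = None
        reviews_count = None
        reviews_summary = game.find("span", {"class": "search_review_summary"})
        if reviews_summary:
            reviews_tooltip = reviews_summary.get("data-tooltip-html")
            if reviews_tooltip:
                parts = reviews_tooltip.split("<br>")
                # The first part is the rating label, e.g. "Very Positive".
                positive_percentage = parts[0].strip()
                if len(parts) > 1:
                    # The second part contains the review count; keep only the number.
                    reviews_count = parts[1].split("of")[-1].split("user")[0].replace("the", "").strip().replace(",", "")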
    
        # Extract supported platforms 
        os_supported = []
        if game.find("span", class_="platform_img win"):
            os_supported.append("Windows")
        if game.find("span", class_="platform_img mac"):
            os_supported.append("macOS")
        if game.find("span", class_="platform_img linux"):
            os_supported.append("Linux")
            
        # Appends game data to list.
        games.append({
            "Name": game_title,
            "Steam ID": app_id,
            "Price": price,
            "Number of Reviews": reviews_count,
            "Average Review": positive_percentage,
            "Operating Systems": ", ".join(os_supported)
        })
    
    return pd.DataFrame(games)

# Fetches and displays the popular VR games.
vr_games_df = fetch_popular_vr_games(limit=5)
print(vr_games_df)



                                                Name Steam ID   Price  \
0                                        War Thunder   236390    Free   
1                                       Phasmophobia   739630  $19.99   
2  Microsoft Flight Simulator 40th Anniversary Ed...  1250410  $59.99   
3                      HITMAN World of Assassination  1659040  $27.99   
4                                      Assetto Corsa   244210  $19.99   

  Number of Reviews           Average Review      Operating Systems  
0            546797          Mostly Positive  Windows, macOS, Linux  
1            578428  Overwhelmingly Positive                Windows  
2             58993          Mostly Positive                Windows  
3             28167            Very Positive                Windows  
4            104555            Very Positive                Windows  


### Data Usage
The data collected from Steam, including game names, prices, number of reviews, and overall review ratings, will be used to explore the key factors that influence user engagement in VR games. One key feature I want to integrate into the function is genre, which I believe is very important for differentiating types of games. One hiccup is that the search results do not provide genres, so I may need to write another function to extract genre data from each game's store page. I also added supported platforms later on, as I believed it was important to see whether supporting more platforms than just Windows affects popularity. Analyzing the relationships between these features may make it possible to predict future shifts in gaming and in virtual reality itself. Most importantly, understanding what improves user engagement should help improve VR as a whole.
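
As a rough sketch of the genre function mentioned above, the code below collects the text of every link whose URL contains `/genre/` from a store page's HTML. That URL pattern is an assumption about Steam's markup, the sample HTML is written by hand, and `GenreLinkParser` and `extract_genres` are hypothetical names. It uses only the standard library so it runs without extra dependencies; the same lookup could be done with BeautifulSoup on a downloaded store page.

```python
from html.parser import HTMLParser


class GenreLinkParser(HTMLParser):
    """Collect the text of every <a> tag whose href contains "/genre/"."""

    def __init__(self):
        super().__init__()
        self.genres = []
        self._in_genre_link = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            self._in_genre_link = "/genre/" in href

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_genre_link = False

    def handle_data(self, data):
        name = data.strip()
        # Record each genre once, in the order it appears on the page.
        if self._in_genre_link and name and name not in self.genres:
            self.genres.append(name)


def extract_genres(store_page_html):
    """Return the list of genre names found in a store page's HTML."""
    parser = GenreLinkParser()
    parser.feed(store_page_html)
    return parser.genres


# Hand-written HTML fragment for illustration, not copied from Steam.
sample_html = """
<div>
  <b>Genre:</b>
  <a href="https://store.steampowered.com/genre/Action/">Action</a>,
  <a href="https://store.steampowered.com/genre/Simulation/">Simulation</a>
  <a href="https://store.steampowered.com/developer/example">Example Studio</a>
</div>
"""
print(extract_genres(sample_html))  # ['Action', 'Simulation']
```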
