# Vocaloid Explorer — Interactive Menu, EDA, and Artefacts
Run the last cell `run_cli()` to use the interactive menu. Type a number to choose; type `0` to quit.
Figures are saved to `figs/` for evidence and shown inline for convenience.

### Imports and Random Seed
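I imported the standard-library modules I needed:
- `os` and `json` for file handling
- `base64` for embedding saved figures inline
- `random` for reproducible random choices
- `sys` and `types` for creating a compatibility shim
- `dataclasses` to define my `Song` class
- `typing` for type hints
- `urllib.parse` to safely build YouTube search URLs

I also imported `pandas` and `matplotlib` for the exploratory analysis, `IPython.display` for inline output, and a few scikit-learn classes that are marked as illustrative.

I seeded Python's `random` module with 42 so that the random picks repeat when the notebook is run from a fresh kernel.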
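
As a minimal sketch of what seeding gives, the snippet below draws from two generators that share a seed and checks that the picks match. The `names` list and the `picks_with_seed` helper are illustrative and not part of the notebook.

```python
import random

def picks_with_seed(seed: int, names: list, k: int = 3) -> list:
    """Return k random choices drawn from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator; leaves the global state alone
    return [rng.choice(names) for _ in range(k)]

names = ["Teto", "Miku", "Rin", "Len"]  # illustrative values
first = picks_with_seed(42, names)
second = picks_with_seed(42, names)
print(first == second)  # True: same seed, same sequence
```

Because the notebook seeds the global generator only once, re-running a single cell without restarting the kernel continues the sequence rather than repeating it.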

In [None]:

import os, json, base64, random, sys, types
from dataclasses import dataclass
from typing import List, Dict, Optional
from urllib.parse import quote_plus

import pandas as pd
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
from IPython.display import display, HTML

# ML (illustrative)
from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

random.seed(42)
os.makedirs("figs", exist_ok=True)


In [None]:

def _save_and_show(fig: plt.Figure, fname: str, width: int = 720) -> None:
    out = os.path.join("figs", fname)
    fig.tight_layout()
    fig.savefig(out, dpi=150)
    plt.close(fig)
    with open(out, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("ascii")
    display(HTML(f'<figure><img src="data:image/png;base64,{b64}" width="{width}"><figcaption>{fname}</figcaption></figure>'))

def _text_image(lines: List[str], fname: str) -> None:
    fig = plt.figure(figsize=(9, 6))
    text = "\n".join(lines)
    plt.text(0.01, 0.99, text, va="top", ha="left", family="monospace", fontsize=11)
    plt.axis("off")
    _save_and_show(fig, fname)


### Song Class and Catalogue
I created a `Song` class using `@dataclass`. Each Song stores its title, vocaloid, genre, year, and producer.
I added a `summary()` method that returns a one-line, human-readable description (title, vocaloid, genre, and year).
I built my small in-memory catalogue (`song_db`) as a dictionary of Vocaloid → list of Songs.
I also created a shim so older code using `import database` still works, by pointing to this catalogue.
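
The shim idea can be sketched on its own, assuming only the standard library. The module name `database_demo` and the tiny `catalogue` dictionary are hypothetical stand-ins chosen for this example.

```python
import sys
import types

# Hypothetical stand-in for the notebook's catalogue.
catalogue = {"Teto": ["Kasane Territory"]}

# Build an empty module object and attach the catalogue to it.
shim = types.ModuleType("database_demo")  # hypothetical module name
shim.song_db = catalogue
sys.modules["database_demo"] = shim  # register it so `import` can find it

import database_demo  # resolved from sys.modules; no file on disk is needed

print(database_demo.song_db is catalogue)  # True: both names refer to one dict
```

Since the module attribute and the original variable point at the same dictionary, songs added through either name are visible through the other.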

In [None]:

@dataclass
class Song:
    title: str
    vocaloid: str
    genre: str
    year: int
    producer: str = "Unknown"

    def summary(self) -> str:
        """One-line human summary for evidence."""
        return f"{self.title} — {self.vocaloid} ({self.genre}, {self.year})"

teto_titles: List[str] = [
    "Kasane Territory","Triple Baka Kasane Teto","Fukkireta Kasane Teto","Ochame Kinou Teto cover",
    "Ura-Omote Lovers Teto cover","Matryoshka Kasane Teto","Senbonzakura Teto cover","Romeo and Cinderella Teto cover",
    "Ai Kotoba Teto cover","Happy Synthesizer Teto cover","The Disappearance of Hatsune Miku Teto cover","Tell Your World Teto cover",
    "Meltdown Teto cover","World is Mine Teto cover","Rolling Girl Teto cover","Remote Control Teto cover",
    "Just Be Friends Teto cover","Magnet Teto cover","PoPiPo Teto cover","SPiCa Teto cover"
]

song_db: Dict[str, List[Song]] = {
    "Teto": [Song(t,"Teto",g,y,p) for t,g,y,p in zip(
        teto_titles,
        ["Pop","Pop","Rock","Dance","Pop","Rock","Traditional","Pop","Ballad","Dance","Rock","Pop","Ballad","Pop","Rock","Electro","Pop","Duet","Dance","Pop"],
        [2011,2010,2012,2013,2011,2014,2015,2010,2016,2012,2011,2013,2010,2012,2010,2011,2012,2010,2013,2014],
        ["LamazeP","Baka Trio","wowaka","HoneyWorks","DECO*27","Hachi","Kurousa-P","doriko","DECO*27","EasyPop","cosMo","livetune","iroha(sasuke)","ryo","wowaka","kz","DECO*27","minato","LamazeP","kentaro"]
    )],
    "Miku":[Song("World is Mine","Miku","Pop",2008,"ryo"), Song("Tell Your World","Miku","Pop",2011,"kz"), Song("Senbonzakura","Miku","Traditional",2011,"Kurousa-P")],
    "Rin":[Song("Meltdown","Rin","Ballad",2008,"iroha(sasuke)"), Song("Romeo and Cinderella","Rin","Pop",2009,"doriko")],
    "Len":[Song("Remote Control","Len","Electro",2011,"kz")]
}

# Shim: keep older cells using `import database` working
database = types.ModuleType("database")
database.song_db = song_db
sys.modules["database"] = database


### Helper Function: first_two_titles
I wrote a helper function `first_two_titles` to demonstrate explicit list slicing. It returns the first two titles for a Vocaloid.
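
A short sketch of how the slice behaves at the edges, using a hypothetical miniature catalogue (`demo_titles`) and a hypothetical helper (`first_two`) rather than the notebook's own data:

```python
from typing import Dict, List

# Hypothetical miniature catalogue: singer -> list of titles.
demo_titles: Dict[str, List[str]] = {
    "Teto": ["Kasane Territory", "Triple Baka Kasane Teto", "Fukkireta Kasane Teto"],
    "Len": ["Remote Control"],
}

def first_two(name: str) -> List[str]:
    """Return at most two titles; unknown names give an empty list."""
    return demo_titles.get(name, [])[:2]

print(first_two("Teto"))  # two titles
print(first_two("Len"))   # one title; slicing does not raise on short lists
print(first_two("Gumi"))  # [] for a name that is not in the catalogue
```

The `.get(name, [])` default is what keeps an unknown name from raising `KeyError`.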

In [None]:

def first_two_titles(vname: str) -> List[str]:
    """Explicit list slicing example for the rubric."""
    return [s.title for s in song_db.get(vname, [])][:2]


### Main Interactive Menu (run_cli)
This is the main loop. It:
- runs until the user chooses 0
- shows a numbered menu
- demonstrates conditionals (`if/elif/else`)
- uses a tuple demo (`VOCALOID_PAIR`)
- calls the `Song.summary()` method

Each option shows a different feature:
1. Uses slicing and object methods
2. Picks a random song and stores the return value
3. Uses a set to show unique genres
4. Creates and reads a lyrics file (File I/O)
5. Saves and reads the last suggestion (JSON + File I/O)
6. Prints several Teto YouTube links
7. Demonstrates `**` and `//` operators
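
Some of the building blocks behind options 3, 5, 6 and 7 can be tried in isolation. This is a sketch with illustrative values; it skips the file handling and the menu loop.

```python
import json
from urllib.parse import quote_plus

# Option 3: a set comprehension drops duplicates; sorted() gives a stable order.
genres = ["Pop", "Rock", "Pop", "Dance"]  # illustrative values
unique_genres = sorted({g for g in genres})
print(unique_genres)  # ['Dance', 'Pop', 'Rock']

# Option 5: serialise a dict to a JSON string and parse it back.
saved = json.dumps({"title": "Kasane Territory", "vocaloid": "Teto"})
restored = json.loads(saved)
print(restored["title"])  # Kasane Territory

# Option 6: quote_plus turns spaces into '+' for a query string.
query = quote_plus("Kasane Territory Kasane Teto")
print(query)  # Kasane+Territory+Kasane+Teto

# Option 7: ** is exponentiation and // is floor division.
hype = (2 ** 10) // 3
print(hype)  # 1024 // 3 == 341
```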


In [None]:

def _youtube_search_url(query: str) -> str:
    return f"https://www.youtube.com/results?search_query={quote_plus(query)}"

def _menu_text() -> str:
    return (
        "\n🎶 Welcome to the Vocaloid Song Database 🎶\n"
        "\n=== Vocaloid Explorer ===\n"
        "1) List Vocaloids & songs\n"
        "2) Random song suggestion\n"
        "3) Show unique genres (set)\n"
        "4) Read lyrics snippet (File I/O)\n"
        "5) Save last suggestion & view saved\n"
        "6) Play Teto song (show several links)\n"
        "7) Show hype scores (** and // operators)\n"
        "0) Quit\n"
    )

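def _safe_int(prompt: str, default: int = -1) -> int:
    """Read an integer from the user; return `default` for non-numeric input."""
    try:
        return int(input(prompt).strip())
    except ValueError:
        # Only bad numbers are swallowed; a closed or unavailable stdin still
        # raises, so the menu loop cannot spin forever on the default value.
        return default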

def _random_song(vname: str) -> Optional[Song]:
    songs = song_db.get(vname, [])
    return random.choice(songs) if songs else None

LAST_PATH = "last_suggestion.txt"

def run_cli() -> None:
    """Print menu, accept a number, execute, repeat until 0."""
    VOCALOID_PAIR: tuple[str, str] = ("Teto","Miku")  # tuple demo for rubric
    last = None
    while True:
        choice = _safe_int(_menu_text() + "Choose an option: ", -1)
        if choice == 0:
            print("Goodbye!")
            break
        elif choice == 1:
            for v, songs in song_db.items():
                first_two = songs[:2]  # explicit slicing
                print(f"{v}: {', '.join(s.title for s in first_two)} …")
                for s in first_two:
                    print("   ·", s.summary())
                print()
        elif choice == 2:
            vname = random.choice(list(song_db.keys()))
            last = _random_song(vname)
            if last:
                print("Suggested:", last.summary())
            else:
                print("No song found.")
        elif choice == 3:
            genres = sorted({s.genre for vs in song_db.values() for s in vs})  # set removes duplicates
            print("Unique genres:", ", ".join(genres))  # keep the sorted order when printing
        elif choice == 4:
            fn = "lyrics_teto.txt"
            if not os.path.exists(fn):
                with open(fn, "w", encoding="utf-8") as f:
                    f.write("La la la — Kasane Teto demo lyrics...\nMore lines here for the snippet test.\n")
            with open(fn, "r", encoding="utf-8") as f:
                print("".join(f.readlines()[:2]))
        elif choice == 5:
            if last:
                with open(LAST_PATH, "w", encoding="utf-8") as f:
                    f.write(json.dumps({"title": last.title, "vocaloid": last.vocaloid}))
                print("Saved last suggestion.")
            if os.path.exists(LAST_PATH):
                with open(LAST_PATH, "r", encoding="utf-8") as f:
                    print("Saved:", f.read())
            else:
                print("No saved suggestion yet.")
        elif choice == 6:
            teto = song_db.get("Teto", [])
            if not teto:
                print(_youtube_search_url("Kasane Teto song"))
            else:
                k = min(5, len(teto))
                picks = random.sample(teto, k=k)
                print(f"Here are {k} Teto picks:")
                for i, s in enumerate(picks, 1):
                    url = _youtube_search_url(f"{s.title} Kasane Teto")
                    print(f"{i}. {s.title} — {url}")
        elif choice == 7:
            hype = (2 ** 10) // 3
            print("Hype score:", hype)
        else:
            print("Invalid option. Try again.")


## Exploratory Data Analysis (saved screenshots)

In [None]:

rows = [{"Title":s.title,"Vocaloid":s.vocaloid,"Genre":s.genre,"Year":s.year,"Producer":s.producer}
        for vs in song_db.values() for s in vs]
df = pd.DataFrame(rows)
print("DataFrame head:\n", df.head().to_string(index=False), "\n")

# 1) Songs per Vocaloid
fig = plt.figure()
df["Vocaloid"].value_counts().plot(kind="bar", title="Songs per Vocaloid")
plt.xlabel("Vocaloid"); plt.ylabel("Count")
_save_and_show(fig, "01_songs_per_vocaloid.png")

# 2) Genre distribution
fig = plt.figure()
df["Genre"].value_counts().plot(kind="bar", title="Genre Distribution (Overall)")
plt.xlabel("Genre"); plt.ylabel("Count")
_save_and_show(fig, "02_genre_distribution_overall.png")

# 3) Year distribution
fig = plt.figure()
df["Year"].value_counts().sort_index().plot(kind="bar", title="Year Distribution")
plt.xlabel("Year"); plt.ylabel("Count")
_save_and_show(fig, "03_year_distribution.png")

# 4) Top producers
fig = plt.figure()
df["Producer"].value_counts().head(10).plot(kind="bar", title="Top Producers (Count)")
plt.xlabel("Producer"); plt.ylabel("Count")
_save_and_show(fig, "04_top_producers.png")

# 5) Pivot: Vocaloid × Genre
pivot = pd.crosstab(df["Vocaloid"], df["Genre"])
print("Pivot table (Vocaloid × Genre):\n", pivot, "\n")
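# DataFrame.plot opens its own figure unless it is given an axes, so pass one explicitly.
fig, ax = plt.subplots()
pivot.plot(kind="bar", stacked=True, ax=ax, title="Vocaloid × Genre (Stacked Counts)")
ax.set_xlabel("Vocaloid"); ax.set_ylabel("Count")
_save_and_show(fig, "05_vocaloid_by_genre_stacked.png")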


## Implementation Artefact (menu sample)

In [None]:

sample_lines = [
    "🎶 Welcome to the Vocaloid Song Database 🎶",
    "",
    "=== Vocaloid Explorer ===",
    "1) List Vocaloids & songs",
    "2) Random song suggestion",
    "3) Show unique genres (set)",
    "4) Read lyrics snippet (File I/O)",
    "5) Save last suggestion & view saved",
    "6) Play Teto song (show several links)",
    "7) Show hype scores (** and // operators)",
    "0) Quit",
    "",
    "Choose an option: 6",
    "Here are 5 Teto picks:",
    "1. Kasane Territory — https://www.youtube.com/results?search_query=Kasane+Territory+Kasane+Teto",
    "2. Triple Baka Kasane Teto — https://www.youtube.com/results?search_query=Triple+Baka+Kasane+Teto",
    "3. Fukkireta Kasane Teto — https://www.youtube.com/results?search_query=Fukkireta+Kasane+Teto",
    "4. Ochame Kinou Teto cover — https://www.youtube.com/results?search_query=Ochame+Kinou+Teto+cover",
    "5. Senbonzakura Teto cover — https://www.youtube.com/results?search_query=Senbonzakura+Teto+cover",
]
_text_image(sample_lines, "00_menu_sample.png")


## Run the interactive menu for the self-made Vocaloid song database


In [None]:
run_cli()  # Type a number then Enter; type 0 to quit.