In [1]:

instructions = """
As a seasoned musician, I excel at transforming general instructions into meticulously crafted musical terminology. My expertise lies in interpreting creative directions and translating them into precise musical elements, whether it be phrasing, dynamics, articulation, or orchestration. Based on the flag you choose, I will adapt my response accordingly:

---

### **Flag 1 (Generating)**
In this mode, I will convert your music-related instructions into detailed music-specific language, following a set structure that includes the following elements:

1. **Meter and Tempo**: Time signature and the pace (e.g., "4/4 time signature," "moderate tempo").
2. **Duration**: Total playtime of the music (e.g., "40 seconds," "31 ~ 45 seconds").
3. **Pitch Range**: Octave range and its impact on the music (e.g., "limited pitch range of 5 octaves").
4. **Key**: Tonality of the music (e.g., "major key," "minor key").
5. **Instrumental Arrangement**: The instruments used and their role in the piece (e.g., "grand piano, guitar, bass, violin, synthesizer, and drum").
6. **Musical Character/Emotion**: Descriptive terms about the mood or feeling of the music (e.g., "energetic beat," "filled with chill").
7. **Structural Elements**: Number of bars and other compositional elements (e.g., "13 ~ 16 bars").
8. **Style or Genre Influence**: Specific styles or influences that define the music (e.g., "pop sound," "classical genre").
9. **Additional Descriptive Elements**: Any unique qualities or character of the music (e.g., "vibrant and dynamic sound," "powerful and intense atmosphere").

This structured response helps generate a consistent, music-centric description that can be easily translated into technical or creative work.

---

**Template for Music Description:**

1. **Time Signature and Tempo:**
   - Example: "The composition is in a 4/4 time signature and is played at a moderate tempo."

2. **Duration:**
   - Example: "The piece lasts approximately 40 seconds."

3. **Pitch Range:**
   - Example: "The limited pitch range of 5 octaves allows for nuanced expression in tone and phrasing."

4. **Key and Tonality:**
   - Example: "Composed in a minor key, the piece evokes a resonant and introspective sound."

5. **Instrumentation:**
   - Example: "The music features a grand piano, guitar, bass, violin, synthesizer, and drums, each contributing to its distinct sonic character."

6. **Mood and Emotional Character:**
   - Example: "The music is characterized by a lively and dynamic mood, creating a sense of delight and energy."

7. **Structure and Form:**
   - Example: "The composition unfolds over 13 to 16 bars."

8. **Style or Genre Influence:**
   - Example: "The piece is influenced by pop music, blending modern elements with a traditional structure."

9. **Additional Descriptive Details:**
   - Example: "The rapid tempo combined with a distinctive rhythmic pattern creates an engaging and memorable listening experience."

---

**Example Query**: I want a happy music:
**Example Response**: 
The composition is set in a 4/4 meter, played at a lively and upbeat tempo, and spans approximately 35 seconds. The music is in a major key, which creates a bright and joyful atmosphere. It features a pitch range of 5 octaves, allowing for dynamic variation and energetic expression. The instrumental arrangement includes grand piano, acoustic guitar, bass, violin, and drums, each contributing to a vibrant and uplifting sound. The structure unfolds over 12 to 14 bars, and the cheerful rhythm, influenced by pop and folk genres, gives the music a catchy and engaging quality. This piece is defined by its playful mood and celebratory character, making it a memorable and enjoyable listening experience.

---

### **Flag 2 (Casual Chat)**
In this mode, I will engage in casual conversation about music or general topics. I can provide friendly and informative answers that do not require the structured music description.

For example, you may ask for general music advice, personal experiences with music, or even non-music-related casual conversations.

---

### **Function Control Commands**
In addition to handling music instructions, I can also perform actions related to the **Music X Machine**, the underlying music-tutoring system. Here are the three available functions I can execute:

1. **Wait**: Do nothing and wait for further stimuli, such as the student speaking or playing music.
2. **StartSession**: Start a Practice Session on Music X Machine. Do not call this function unless all modes are already set.
3. **InterruptSession**: Immediately end the Practice Session on Music X Machine. Call this when the student is having trouble or has started speaking in the middle of a Session.

---

In **Flag 1**, you can ask for a music description, and I will return a paragraph following the template.
In **Flag 2**, you can ask general questions or request to perform one of the available commands (e.g., "StartSession").

**Function Query Example**:
- Query: Start a new session.
- Response: Executing StartSession... Starting Practice Session on Music X Machine...

Use the flags to guide the conversation. Let me know if you want to generate music-specific instructions or engage in casual chat!
"""

from openai import OpenAI
text2text_client = OpenAI(api_key="")

assistant = text2text_client.beta.assistants.create(
  name="Chat Piano",
  instructions=instructions,
  tools=[],
  model="gpt-4o",
)

thread = text2text_client.beta.threads.create()


In [2]:
import time
import requests
from midi2audio import FluidSynth
from IPython.display import Audio, display

class TextToMidiClient:
    def __init__(self, base_url):
        self.base_url = base_url.rstrip('/')

    def submit_text(self, text):
        url = f"{self.base_url}/submit-text"
        headers = {'Content-Type': 'application/json'}
        data = {'text': text}

        response = requests.post(url, json=data, headers=headers)
        response.raise_for_status()
        return response.json()

    def check_status(self, job_id):
        url = f"{self.base_url}/check-status/{job_id}"
        response = requests.get(url)
        response.raise_for_status()
        return response.json()

    def get_result(self, job_id):
        url = f"{self.base_url}/get-result/{job_id}"
        response = requests.get(url)
        response.raise_for_status()
        return response.json()

    def download_midi(self, job_id, save_path):
        url = f"{self.base_url}/download-midi/{job_id}"
        response = requests.get(url, stream=True)
        response.raise_for_status()

        with open(save_path, 'wb') as f:
            for chunk in response.iter_content(chunk_size=8192):
                f.write(chunk)
        return save_path
    
# Initialize the client with the base URL of your Flask server
text2midi_client = TextToMidiClient(base_url="http://localhost:5000")


In [3]:
# Define an assistant tool to handle music conversion
def convert_text_to_midi(text_command):
    try:
        # Step 1: Submit text for MIDI conversion
        submit_response = text2midi_client.submit_text(text_command)
        job_id = submit_response['jobId']
        print(f"Job submitted. Job ID: {job_id}")

        # Step 2: Poll the job status until it's completed
        while True:
            status_response = text2midi_client.check_status(job_id)
            status = status_response['status']
            print(f"Job Status: {status}")
            if status == 'completed':
                break
            elif status == 'failed':
                print("Job failed.")
                return None
            time.sleep(2)  # Wait for 2 seconds before checking again

        # Step 3: Retrieve the result (metadata)
        result_response = text2midi_client.get_result(job_id)
        meta_data = result_response['metaData']
        print("Metadata:")
        print(meta_data)

        # Step 4: Download the MIDI file
        midi_file_path = text2midi_client.download_midi(job_id, save_path=f"storage/{job_id}.mid")
        print(f"MIDI file downloaded to {midi_file_path}")

        return midi_file_path, meta_data

    except requests.HTTPError as http_err:
        print(f"HTTP error occurred: {http_err.response.json()}")
        return None
    except Exception as err:
        print(f"An error occurred: {err}")
        return None

In [4]:
def process_input(input_text):
    """Process input based on flag mode and send to assistant."""
    message = text2text_client.beta.threads.messages.create(
        thread_id=thread.id,
        role="user",
        content=input_text
    )

    run = text2text_client.beta.threads.runs.create_and_poll(
        thread_id=thread.id,
        assistant_id=assistant.id
    )
    
    return run

In [5]:
import threading
import queue
import time
import re

# Assuming previously defined TextToMidiClient and text2midi_client

# Class to control Music X Machine
class ControlMusicX:
    def __init__(self):
        self.session_active = False

    def start_session(self):
        if not self.session_active:
            print("Starting Practice Session on Music X Machine...")
            self.session_active = True
        else:
            print("Session is already active.")

    def interrupt_session(self):
        if self.session_active:
            print("Interrupting Practice Session on Music X Machine...")
            self.session_active = False
        else:
            print("No session to interrupt.")

    def wait(self):
        print("Waiting for further stimuli...")

# Initialize ControlMusicX
music_x = ControlMusicX()

def handle_midi_conversion(text_command, result_queue):
    """Thread function to handle MIDI conversion and store results."""
    midi_result = convert_text_to_midi(text_command)
    result_queue.put(midi_result)  # Place the result in a queue for the main thread to retrieve.

def detect_commands(ai_response):
    """Detect commands in AI response and handle them."""
    if re.search(r'\bStartSession\b', ai_response, re.IGNORECASE):
        music_x.start_session()
        return "StartSession"
    elif re.search(r'\bInterruptSession\b', ai_response, re.IGNORECASE):
        music_x.interrupt_session()
        return "InterruptSession"
    elif re.search(r'\bWait\b', ai_response, re.IGNORECASE):
        music_x.wait()
        return "Wait"
    return None

def main():
    conversion_thread = None
    result_queue = queue.Queue()

    try:
        while True:
            # Check if there's a MIDI conversion in progress
            if conversion_thread and conversion_thread.is_alive():
                # Optionally, inform the user about the conversion status
                print("MIDI conversion is in progress...")
            else:
                # If the MIDI conversion has finished, handle the result
                if conversion_thread and not conversion_thread.is_alive():
                    if not result_queue.empty():
                        midi_result = result_queue.get()
                        if midi_result:
                            midi_file_path, meta_data = midi_result
                            print(f"MIDI file saved at: {midi_file_path}")
                            print("Metadata: ", meta_data)
                        else:
                            print("MIDI conversion failed.")
                        conversion_thread = None  # Reset the thread
            # Get user input
            user_input = input("You: ")
            if user_input.lower() == 'exit':
                break
            # Prepare the assistant input without exposing flags to the user
            assistant_input = "Flag 2: " + user_input
            # Send input to the assistant
            run = process_input(assistant_input)

            if run.status == 'completed':
                messages = text2text_client.beta.threads.messages.list(thread_id=thread.id)
                ai_response = messages.data[-1].content[0].text.value  # Get the latest response
                print("Assistant:", ai_response)

                # Detect commands in the AI response
                command = detect_commands(ai_response)

                if command == "StartSession":
                    # In generating mode, the assistant provides the music description
                    # Get user input
                    user_input = input("You: ")
                    # Prepare the assistant input without exposing flags to the user
                    assistant_input = "Flag 1: " + user_input + "[return the instruction only, do not contain any further instruction about the way it plays, only generate 10 seconds.]"
                    # Send input to the assistant
                    run = process_input(assistant_input)
                    if run.status == 'completed':
                        messages = text2text_client.beta.threads.messages.list(thread_id=thread.id)
                        ai_response = messages.data[-1].content[0].text.value  # Get the latest response
                        print("Assistant:", ai_response)
                        # Start the MIDI conversion using the assistant's response
                        conversion_thread = threading.Thread(target=handle_midi_conversion, args=(ai_response, result_queue))
                        conversion_thread.start()
                elif command == "InterruptSession":
                    pass
                elif command == "Wait":
                    pass
                else:
                    pass

            else:
                print(f"Assistant run status: {run.status}")

            # Allow some time before the next iteration to prevent overwhelming the assistant
            time.sleep(0.5)

    except KeyboardInterrupt:
        print("Program exited by user.")

    # Ensure that any remaining thread is completed before exiting
    if conversion_thread and conversion_thread.is_alive():
        conversion_thread.join()
        if not result_queue.empty():
            midi_result = result_queue.get()
            if midi_result:
                midi_file_path, meta_data = midi_result
                print(f"MIDI file saved at: {midi_file_path}")
                print("Metadata: ", meta_data)
            else:
                print("MIDI conversion failed.")

In [None]:
main()


Assistant: Flag 2: 1
Assistant run status: failed
Assistant: Flag 2: 1
