<a href="https://colab.research.google.com/github/venezianof/booksum/blob/main/notebooks/en/enterprise_cookbook_gradio.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Creating Demos with Spaces and Gradio
_Authored by: [Diego Maniloff](https://huggingface.co/dmaniloff)_

In [5]:
import pandas as pd

# Assuming analysis_output_enhanced_longevity is already available in the kernel state
if 'analysis_output_enhanced_longevity' in locals() and \
   'correlation_analysis' in analysis_output_enhanced_longevity and \
   'pearson_correlation_matrix' in analysis_output_enhanced_longevity['correlation_analysis']:

    correlation_matrix_dict = analysis_output_enhanced_longevity['correlation_analysis']['pearson_correlation_matrix']

    if isinstance(correlation_matrix_dict, dict):
        correlation_df = pd.DataFrame(correlation_matrix_dict)
        print("Pearson correlation matrix:\n")
        display(correlation_df)
    else:
        print(correlation_matrix_dict)
else:
    print("The advanced analysis has not been run yet, or the correlation results are not available.")
    print("Run cell `3b5f19b5` to generate the output.\n")


Pearson correlation matrix:



| | glp1_level | glucose | insulin | geneA_meth |
|---|---|---|---|---|
| glp1_level | 1.0 | 0.775114 | 0.815492 | 0.996791 |
| glucose | 0.775114 | 1.0 | 0.997176 | 0.722806 |
| insulin | 0.815492 | 0.997176 | 1.0 | 0.767772 |
| geneA_meth | 0.996791 | 0.722806 | 0.767772 | 1.0 |


In [None]:
import gradio as gr
import pandas as pd

def gradio_analyze_glp1(metabolic_data_input: list, epigenetic_markers_input: list, glp1_levels_str: str) -> dict:
    """
    Wrapper function to convert Gradio inputs to the formats expected by
    analyze_glp1_epigenetics_metabolism and call it.
    """
    # gr.Dataframe(type="array") passes only the table body as a list of rows;
    # the column names come from the component's `headers`, so set them explicitly.
    try:
        metabolic_data_df = pd.DataFrame(metabolic_data_input or [], columns=["glucose", "insulin"])
        epigenetic_markers_df = pd.DataFrame(epigenetic_markers_input or [], columns=["geneA_meth", "geneB_histone"])

        # Convert comma-separated string of GLP-1 levels to a list of floats
        glp1_levels_list = [float(level.strip()) for level in glp1_levels_str.split(',') if level.strip()]

        return analyze_glp1_epigenetics_metabolism(
            metabolic_data=metabolic_data_df,
            epigenetic_markers=epigenetic_markers_df,
            glp1_levels=glp1_levels_list
        )
    except ValueError as e:
        raise gr.Error(f"Error converting the inputs: {e}. Make sure the data is in the correct format.")
    except Exception as e:
        raise gr.Error(f"An error occurred: {e}")

# Create the Gradio interface
# (run the cell that defines `analyze_glp1_epigenetics_metabolism` and the dummy data first)
glp1_demo = gr.Interface(
    fn=gradio_analyze_glp1,
    inputs=[
        gr.Dataframe(headers=["glucose", "insulin"], type="array", label="Metabolic Data", value=dummy_metabolic_data.values.tolist()),
        gr.Dataframe(headers=["geneA_meth", "geneB_histone"], type="array", label="Epigenetic Markers", value=dummy_epigenetic_markers.values.tolist()),
        gr.Textbox(label="GLP-1 levels (comma-separated numbers)", value=", ".join(map(str, dummy_glp1_levels)))
    ],
    outputs=gr.JSON(label="Analysis Results"),
    title="GLP-1, Epigenetics, and Metabolism Analysis",
    description="Enter the metabolic data, epigenetic markers, and GLP-1 levels to run the analysis."
)

# Launch the demo
glp1_demo.launch()

In [None]:
import pandas as pd

def analyze_glp1_epigenetics_metabolism(metabolic_data: pd.DataFrame, epigenetic_markers: pd.DataFrame, glp1_levels: list) -> dict:
    """
    Placeholder function to simulate the analysis of GLP-1, epigenetics, and metabolism.
    In a real scenario, this would involve complex bioinformatics and statistical analysis.
    """
    print("Performing preliminary analysis...")

    # Example placeholder for metabolic and epigenetic correlation
    num_metabolic_samples = len(metabolic_data)
    num_epigenetic_samples = len(epigenetic_markers)
    avg_glp1 = sum(glp1_levels) / len(glp1_levels) if glp1_levels else 0

    # Simulate some findings
    results = {
        "message": "Analysis initiated. Further sophisticated models would be required here.",
        "metabolic_samples_processed": num_metabolic_samples,
        "epigenetic_markers_analyzed": num_epigenetic_samples,
        "average_glp1_level": f"{avg_glp1:.2f} pg/mL",
        "potential_longevity_impact": "Requires deeper longitudinal study"
    }

    return results

# Example usage (dummy data)
dummy_metabolic_data = pd.DataFrame({'glucose': [90, 110, 85], 'insulin': [5, 12, 4]})
dummy_epigenetic_markers = pd.DataFrame({'geneA_meth': [0.1, 0.3, 0.2], 'geneB_histone': ['ac', 'me', 'ac']})
dummy_glp1_levels = [15, 22, 18]

# Call the placeholder function
analysis_output = analyze_glp1_epigenetics_metabolism(
    metabolic_data=dummy_metabolic_data,
    epigenetic_markers=dummy_epigenetic_markers,
    glp1_levels=dummy_glp1_levels
)
print(analysis_output)

## Introduction
In this notebook we will demonstrate how to bring any machine learning model to life using [Gradio](https://www.gradio.app/), a library that allows you to create a web demo from any Python function and share it with the world 🌎!

📚 This notebook covers:
- Building a `Hello, World!` demo: The basics of Gradio
- Moving your demo to Hugging Face Spaces
- Making it interesting: a real-world example that leverages the 🤗 Hub
- Some of the cool "batteries included" features that come with Gradio

⏭️ At the end of this notebook you will find a `Further Reading` list with links to keep going on your own.

## Setup
To get started, install the `gradio` library along with `transformers`.

In [None]:
!pip -q install gradio==4.36.1
!pip -q install transformers==4.41.2

In [None]:
# the usual shorthand is to import gradio as gr
import gradio as gr

## Your first demo: the basics of Gradio
At its core, Gradio turns any Python function into a web interface.

Say we have a simple function that takes `name` and `intensity` as parameters, and returns a string like so:


In [None]:
def greet(name: str, intensity: int) -> str:
    return "Hello, " + name + "!" * int(intensity)

If you run this function for the name 'Diego' you will get an output string that looks like this:


In [None]:
print(greet("Diego", 3))

Hello, Diego!!!


With Gradio, we can build an interface for this function via the `gr.Interface` class. All we need to do is pass in the `greet` function we created above, and the kinds of inputs and outputs that our function expects:


In [None]:
demo = gr.Interface(
    fn=greet,
    inputs=["text", "slider"], # the inputs are a text box and a slider ("text" and "slider" are components in Gradio)
    outputs=["text"],          # the output is a text box
)

Notice how we passed in `["text", "slider"]` as inputs and `["text"]` as outputs -- these are called [Components](https://www.gradio.app/docs/gradio/introduction) in Gradio.

That's all we need for our first demo. Go ahead and try it out 👇🏼! Type your name into the `name` textbox, slide the intensity that you want, and click `Submit`.

In [None]:
# the launch method will fire up the interface we just created
demo.launch()

## Let's make it interesting: a meeting transcription tool
At this point you understand how to take a basic Python function and turn it into
a web-ready demo. However, we only did this for a function that is very simple, a bit boring even!

Let's consider a more interesting example that highlights the very thing that Gradio was built for: demoing cutting-edge machine learning models. A good friend of mine recently asked me for help with an audio recording of an interview she had done. She needed to convert the audio file into a well-organized text summary. How did I help her? I built a Gradio app!

Let's walk through the steps to build the meeting transcription tool. We can think of the process as two parts:


1. Transcribe the audio file into text
2. Organize the text into sections, paragraphs, lists, etc. We could include summarization here too.



### Audio-to-text
In this part we will build a demo that handles the first step of the meeting transcription tool: converting audio into text.

As we learned, the key ingredient to building a Gradio demo is to have a Python function that executes the logic we are trying to showcase. For the audio-to-text conversion, we will build our function using the awesome `transformers` library and its `pipeline` utility to use a popular audio-to-text model called `distil-whisper/distil-large-v3`.

The result is the following `transcribe` function, which takes as input the audio that we want to convert:



In [None]:
import os
import tempfile

import torch
import gradio as gr
from transformers import pipeline

device = 0 if torch.cuda.is_available() else "cpu"

AUDIO_MODEL_NAME = "distil-whisper/distil-large-v3" # faster and very close in performance to the full-size "openai/whisper-large-v3"
BATCH_SIZE = 8


pipe = pipeline(
    task="automatic-speech-recognition",
    model=AUDIO_MODEL_NAME,
    chunk_length_s=30,
    device=device,
)


def transcribe(audio_input):
    """Function to convert audio to text."""
    if audio_input is None:
        raise gr.Error("No audio file submitted!")

    output = pipe(
        audio_input,
        batch_size=BATCH_SIZE,
        generate_kwargs={"task": "transcribe"},
        return_timestamps=True
    )
    return output["text"]

Now that we have our Python function, we can demo it by passing it into `gr.Interface`. Notice how in this case the input that the function expects is the audio that we want to convert. Gradio includes a ton of useful components, one of which is [Audio](https://www.gradio.app/docs/gradio/audio), exactly what we need for our demo 🎶 😎.


In [None]:
part_1_demo = gr.Interface(
    fn=transcribe,
    inputs=gr.Audio(type="filepath"), # "filepath" passes a str path to a temporary file containing the audio
    outputs=gr.Textbox(show_copy_button=True), # give users the option to copy the results
    title="Transcribe Audio to Text", # give our demo a title :)
)

part_1_demo.launch()

Go ahead and try it out 👆! You can upload an `.mp3` file or hit the 🎤 button to record your own voice.

For a sample file with an actual meeting recording, you can check out the [MeetingBank_Audio dataset](https://huggingface.co/datasets/huuuyeah/MeetingBank_Audio) which is a dataset of meetings from city councils of 6 major U.S. cities. For my own testing, I tried out a couple of the [Denver meetings](https://huggingface.co/datasets/huuuyeah/MeetingBank_Audio/blob/main/Denver/mp3/Denver-21.zip).

> [!TIP]
> Also check out `Interface`'s [from_pipeline](https://www.gradio.app/docs/gradio/interface#interface-from_pipeline) constructor which will directly build the `Interface` from a `pipeline`.

### Organize & summarize text
For part 2 of the meeting transcription tool, we need to organize the transcribed text from the previous step.

Once again, to build a Gradio demo we need the Python function with the logic that we care about. For text organization and summarization, we will use an "instruction-tuned" model that is trained to follow a broad range of tasks. There are many options to pick from such as [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) or [mistralai/Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3). For our example we are going to use [microsoft/Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct).


Just like for part 1, we could leverage the `pipeline` utility within `transformers` to do this, but instead we will take this opportunity to showcase the [Serverless Inference API](https://huggingface.co/docs/api-inference/index), which is an API within the Hugging Face Hub that allows us to use thousands of publicly accessible (or your own privately permissioned) machine learning models ***for free***! Check out the cookbook section on the Serverless Inference API [here](https://huggingface.co/learn/cookbook/en/enterprise_hub_serverless_inference_api).

Using the Serverless Inference API means that instead of calling a model via a pipeline (like we did for the audio conversion part), we will call it from the `InferenceClient`, which is part of the `huggingface_hub` library ([Hub Python Library](https://huggingface.co/docs/huggingface_hub/en/package_reference/login)). In turn, to use the `InferenceClient`, we need to log into the 🤗 Hub using `notebook_login()`, which will produce a dialog box asking for your User Access Token to authenticate with the Hub.

You can manage your tokens from your [personal settings page](https://huggingface.co/settings/tokens), and please remember to use [fine-grained](https://huggingface.co/docs/hub/security-tokens) tokens as much as possible for enhanced security.

In [None]:
from huggingface_hub import notebook_login, InferenceClient

# running this will prompt you to enter your Hugging Face credentials
notebook_login()

Now that we are logged into the Hub, we can write our text processing function using the Serverless Inference API via `InferenceClient`.

The code for this part will be structured into two functions:

- `build_messages`, to format the message prompt for the LLM;
- `organize_text`, to actually pass the raw meeting text into the LLM for organization (and summarization, depending on the prompt we provide).


In [None]:
# sample meeting transcript from huuuyeah/MeetingBank_Audio
# this is just a copy-paste from the output of part 1 using one of the Denver meetings
sample_transcript = """
 Good evening. Welcome to the Denver City Council meeting of Monday, May 8, 2017. My name is Kelly Velez. I'm your Council Secretary. According to our rules of procedure, when our Council President, Albus Brooks, and Council President Pro Tem, JoLynn Clark, are both absent, the Council Secretary calls the meeting to order. Please rise and join Councilman Herndon in the Pledge of Allegiance. Madam Secretary, roll call. Roll call. Here. Mark. Espinosa. Here. Platt. Delmar. Here. Here. Here. Here. We have five members present. There is not a quorum this evening. Many of the council members are participating in an urban exploration trip in Portland, Oregon, pursuant to Section 3.3.4 of the city charter. Because there is not a quorum of seven council members present, all of tonight's business will move to next week, to Monday, May 15th. Seeing no other business before this body except to wish Councilwoman Keniche a very happy birthday this meeting is adjourned Thank you. A standard model and an energy efficient model likely will be returned to you in energy savings many times during its lifespan. Now, what size do you need? Air conditioners are not a one-size-or-type fits all. Before you buy an air conditioner, you need to consider the size of your home and the cost to operate the unit per hour. Do you want a room air conditioner, which costs less but cools a smaller area, or do you want a central air conditioner, which cools your entire house but costs more? Do your homework. Now, let's discuss evaporative coolers. In low humidity areas, evaporating water into the air provides a natural and energy efficient means of cooling. Evaporative coolers, also called swamp coolers, cool outdoor air by passing it over water saturated pads, causing the water to evaporate into it. Evaporative coolers cost about one half as much to install as central air conditioners and use about one-quarter as much energy. 
However, they require more frequent maintenance than refrigerated air conditioners, and they're suitable only for areas with low humidity. Watch the maintenance tips at the end of this segment to learn more. And finally, fans. When air moves around in your home, it creates a wind chill effect. A mere two-mile-an-hour breeze will make your home feel four degrees cooler and therefore you can set your thermostat a bit higher. Ceiling fans and portable oscillating fans are cheap to run and they make your house feel cooler. You can also install a whole house fan to draw the hot air out of your home. A whole house fan draws cool outdoor air inside through open windows and exhausts hot room air through the attic to the outside. The result is excellent ventilation, lower indoor temperatures, and improved evaporative cooling. But remember, there are many low-cost, no-cost ways that you can keep your home cool. You should focus on these long before you turn on your AC or even before you purchase an AC. But if you are going to purchase a new cooling system, remember to get one that's energy efficient and the correct size for your home. Wait, wait, don't go away, there's more. After this segment of the presentation is over, you're going to be given the option to view maintenance tips about air conditioners and evaporative coolers. Now all of these tips are brought to you by the people at Xcel Energy. Thanks for watching.
"""

In [None]:
from huggingface_hub import InferenceClient

TEXT_MODEL_NAME = "microsoft/Phi-3-mini-4k-instruct"

client = InferenceClient()

def organize_text(meeting_transcript):
    messages = build_messages(meeting_transcript)
    response = client.chat_completion(
        messages, model=TEXT_MODEL_NAME, max_tokens=250, seed=430
    )
    return response.choices[0].message.content


def build_messages(meeting_transcript) -> list:
    system_input = "You are an assistant that organizes meeting minutes."
    user_input = """Take this raw meeting transcript and return an organized version.
    Here is the transcript:
    {meeting_transcript}
    """.format(meeting_transcript=meeting_transcript)

    messages = [
        {"role": "system", "content": system_input},
        {"role": "user", "content": user_input},
    ]
    return messages




And now that we have our text organization function `organize_text`, we can build a demo for it as well:

In [None]:
part_2_demo = gr.Interface(
    fn=organize_text,
    inputs=gr.Textbox(value=sample_transcript),
    outputs=gr.Textbox(show_copy_button=True),
    title="Clean Up Transcript Text",
)
part_2_demo.launch()

Go ahead and try it out 👆! If you hit "Submit" in the demo above, you will see that the output text is a much clearer and better-organized version of the transcript, with a title and sections for the different parts of the meeting.

See if you can get a summary by playing around with the `user_input` variable that controls the LLM prompt.

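As a starting point for that experiment, here is a minimal sketch of a prompt variant that asks for a short summary instead of a reorganized transcript. The function name `build_summary_messages` and the `max_bullets` parameter are hypothetical and not part of the original notebook; the message format mirrors `build_messages` above.

```python
def build_summary_messages(meeting_transcript: str, max_bullets: int = 5) -> list:
    """Hypothetical variant of build_messages that asks for a bullet-point summary."""
    system_input = "You are an assistant that summarizes meeting minutes."
    user_input = (
        f"Summarize this raw meeting transcript in at most {max_bullets} bullet points.\n"
        "Here is the transcript:\n"
        f"{meeting_transcript}\n"
    )
    return [
        {"role": "system", "content": system_input},
        {"role": "user", "content": user_input},
    ]


messages = build_summary_messages("Roll call. Five members present. Meeting adjourned.", max_bullets=3)
print(messages[1]["content"])
```

To try it, swap the `build_messages` call inside `organize_text` for this function and compare the outputs.
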
### Putting it all together
At this point we have a function for each of the two steps we want our meeting transcription tool to do:
1. convert the audio into a text file, and
2. organize that text file into a nicely-formatted meeting document.

All we have to do next is stitch these two functions together and build a demo for the combined steps. In other words, our complete meeting transcription tool is just a new function (which we'll creatively call `meeting_transcript_tool` 😀) that takes the output of `transcribe` and passes it into `organize_text`:


In [None]:
def meeting_transcript_tool(audio_input):
    meeting_text = transcribe(audio_input)
    organized_text = organize_text(meeting_text)
    return organized_text


full_demo = gr.Interface(
    fn=meeting_transcript_tool,
    inputs=gr.Audio(type="filepath"),
    outputs=gr.Textbox(show_copy_button=True),
    title="The Complete Meeting Transcription Tool",
)
full_demo.launch()


Go ahead and try it out 👆! This is now the full demo of our transcript tool. If you give it an audio file, the output will be the already-organized (and potentially summarized) version of the meeting. Super cool 😎.

## Move your demo into 🤗 Spaces
If you made it this far, now you know the basics of how to create a demo of your machine learning model using Gradio 👏!

Up next we are going to show you how to take your brand new demo to Hugging Face Spaces. On top of the ease of use and powerful features of Gradio, moving your demo to 🤗 Spaces gives you the benefit of permanent hosting, ease of deployment each time you update your app, and the ability to share your work with anyone! Do keep in mind that your Space will go to sleep after a while unless you are using it or making changes to it.


The first step is to head over to [https://huggingface.co/new-space](https://huggingface.co/new-space), select "Gradio" from the templates, and leave the rest of the options as default for now (you can change these later):

<img src="https://github.com/dmaniloff/public-screenshots/blob/main/create-new-space.png?raw=true" width="350" alt="image description">

This will result in a newly created Space that you can populate with your demo code. As an example for you to follow, I created the 🤗 Space `dmaniloff/meeting-transcript-tool`, which you can access [here](https://huggingface.co/spaces/dmaniloff/meeting-transcript-tool).

There are two files we need to edit:

*   `app.py` -- This is where the demo code lives. It should look something like this:
      ```python
      # outline of app.py:

      def meeting_transcript_tool(...):
         ...

      def transcribe(...):
         ...

      def organize_text(...):
         ...

      ```

*   `requirements.txt` -- This is where we tell our Space about the libraries it will need. It should look something like this:
      ```
      # contents of requirements.txt:
      torch
      transformers
      ```


## Gradio comes with batteries included 🔋

Gradio comes with lots of cool functionality right out of the box. We won't be able to cover all of it in this notebook, but here are three features that we will check out:

- Access as an API
- Sharing via public URL
- Flagging



### Access as an API
One of the benefits of building your web demos with Gradio is that you automatically get an API 🙌! This means that you can access the functionality of your Python function using a standard HTTP client like `curl` or the Python `requests` library.

If you look closely at the demos we created above, you will see at the bottom there is a link that says "Use via API". If you click on it in the Space I created ([dmaniloff/meeting-transcript-tool](https://huggingface.co/spaces/dmaniloff/meeting-transcript-tool/blob/main/app.py)), you will see the following:

<img src="https://github.com/dmaniloff/public-screenshots/blob/main/gradio-as-api.png?raw=true" width="750" alt="image description">

Let's go ahead and copy-paste that code below to use our Space as an API:

In [None]:
!pip install gradio_client

In [None]:
from gradio_client import Client, handle_file

client = Client("dmaniloff/meeting-transcript-tool")
result = client.predict(
    audio_input=handle_file('https://github.com/gradio-app/gradio/raw/main/test/test_files/audio_sample.wav'),
    api_name="/predict"
)
print(result)

Loaded as API: https://dmaniloff-meeting-transcript-tool.hf.space ✔
Certainly! Below is an organized version of a hypothetical meeting transcript. Since the original transcript you've provided is quite minimal, I'll create a more detailed and structured example featuring a meeting summary.

---

# Meeting Transcript: Project Alpha Kickoff

**Date:** April 7, 2023

**Location:** Conference Room B, TechCorp Headquarters


**Attendees:**

- John Smith (Project Manager)

- Emily Johnson (Lead Developer)

- Michael Brown (Marketing Lead)

- Lisa Green (Design Lead)


**Meeting Duration:** 1 hour 30 minutes


## Opening Remarks

**John Smith:**

Good morning everyone, and thank you for joining this kickoff meeting for Project Alpha. Today, we'll discuss our project vision, milestones, and roles. Let's get started.


## Vision and Goals

**Emily Johnson:**

The main goal of Project Alpha is to


Wow! What happened there? Let's break it down:

- We installed the `gradio_client`, which is a package that is specifically designed to interact with APIs built with Gradio.
- We instantiated the client by providing the name of the 🤗 Space that we want to query.
- We called the `predict` method of the client and passed in a sample audio file to it.

The Gradio client takes care of making the HTTP POST for us, and it also provides functionality like reading the input audio file that our meeting transcript tool will process (via the function `handle_file`).

Again, using this client is a choice, and you can just as well run a `curl -X POST https://dmaniloff-meeting-transcript-tool.hf.space/call/predict [...]` and pass in all the parameters needed in the request.

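For reference, the first step of that raw HTTP flow can be sketched in Python with only the standard library. This assumes the two-step protocol that Gradio 4 documents for `curl` clients: a POST to `/call/predict` with a JSON body of the form `{"data": [...]}` returns an `event_id`, and a GET to `/call/predict/<event_id>` streams the result. The helper name `build_predict_request` is hypothetical, and passing the audio as `{"path": <url>}` is an assumption about how this Space accepts file inputs.

```python
import json
import urllib.request

SPACE_URL = "https://dmaniloff-meeting-transcript-tool.hf.space"
SAMPLE_AUDIO = "https://github.com/gradio-app/gradio/raw/main/test/test_files/audio_sample.wav"


def build_predict_request(space_url: str, audio_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that starts a prediction."""
    payload = {"data": [{"path": audio_url}]}  # assumed file-input format
    return urllib.request.Request(
        url=f"{space_url}/call/predict",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_predict_request(SPACE_URL, SAMPLE_AUDIO)
print(request.get_method(), request.full_url)

# Sending it requires network access and a running Space:
# with urllib.request.urlopen(request) as response:
#     event_id = json.load(response)["event_id"]
# The result can then be read from f"{SPACE_URL}/call/predict/{event_id}".
```
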
> [!TIP]
> The output that we get from the call above is a made-up meeting that was generated by the LLM that we are using for text organization. This is because the sample input file isn't an actual meeting recording. You can tweak the LLM's prompt to handle this case.

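One way to act on that tip is sketched below: a variant of the system prompt that tells the model not to invent content, plus a cheap length check that could run before calling the LLM at all. The names `GUARDED_SYSTEM_INPUT`, `MIN_WORDS`, and `looks_like_meeting` are hypothetical, and the 20-word threshold is an arbitrary assumption.

```python
GUARDED_SYSTEM_INPUT = (
    "You are an assistant that organizes meeting minutes. "
    "If the text does not look like a meeting transcript, say so and do not invent content."
)
MIN_WORDS = 20  # arbitrary threshold, chosen only for illustration


def looks_like_meeting(transcript: str, min_words: int = MIN_WORDS) -> bool:
    """Rough heuristic: very short transcripts are unlikely to be real meetings."""
    return len(transcript.split()) >= min_words


print(looks_like_meeting("Hello there."))
```

Whether the model actually follows such an instruction depends on the model, so treat the prompt as something to test rather than a guarantee.
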
### Share via public URL
Another cool feature built into Gradio is that even if you build your demo on your local computer (before you move it into a 🤗 Space), you can still share it with anyone in the world by passing `share=True` into `launch`, like so:

```python
demo.launch(share=True)
```

You might have noticed that in this Google Colab environment that behaviour is enabled by default, so the previous demos that we created already had a public URL that you can share 🌎. Go back ⬆ and look at the logs for `Running on public URL:` to find it 🔎!

### Flagging
[Flagging](https://www.gradio.app/guides/using-flagging) is a feature built into Gradio that allows the users of your demo to provide feedback. You might have noticed that the first demo we created had a `Flag` button at the bottom.

Under the default options, if a user clicks that button then the input and output samples are saved into a CSV log file that you can review later. If the demo involves audio (like in our case), these are saved separately in a parallel directory and the paths to these files are saved in the CSV file.

Go back and play with our first demo once more, and then click the `Flag` button. You will see that a new log file is created in the `flagged` directory:

In [None]:
!cat flagged/log.csv

name,intensity,output,flag,username,timestamp
Diego,4,"Hello, Diego!!!!",,,2024-06-29 22:07:50.242707


In this case I set the inputs to `name=Diego` and `intensity=4`, which I then flagged. You can see that the log file includes the inputs to the function, the output `"Hello, Diego!!!!"`, and also a timestamp.

While a list of inputs and outputs that your users found problematic is better than nothing, Gradio's flagging feature allows you to do much more. For example, you can provide a `flagging_options` parameter that lets you customize the kind of feedback or errors that you can receive, such as `["Incorrect", "Ambiguous"]`. Note that this requires that `allow_flagging` is set to `"manual"`:

In [None]:
demo_with_custom_flagging = gr.Interface(
    fn=greet,
    inputs=["text", "slider"], # the inputs are a text box and a slider ("text" and "slider" are components in Gradio)
    outputs=["text"],          # the output is a text box
    allow_flagging="manual",
    flagging_options=["Incorrect", "Ambiguous"],
)
demo_with_custom_flagging.launch()

Go ahead and try it out 👆! You can see that the flagging buttons are now `Flag as Incorrect` and `Flag as Ambiguous`, and the new log file will reflect those options:

In [None]:
!cat flagged/log.csv

name,intensity,output,flag,username,timestamp
Diego,4,"Hello, Diego!!!!",,,2024-06-29 22:07:50.242707
Diego,5,"Hello, Diego!!!!!",Ambiguous,,2024-06-29 22:08:04.281030

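Because the log is plain CSV, it can be reviewed with the standard library. The sketch below parses a copy of the log shown above from an in-memory string (so it runs without a `flagged` directory) and keeps only the rows that carry a flag label. The helper name `flagged_rows` is hypothetical.

```python
import csv
import io

LOG_TEXT = '''name,intensity,output,flag,username,timestamp
Diego,4,"Hello, Diego!!!!",,,2024-06-29 22:07:50.242707
Diego,5,"Hello, Diego!!!!!",Ambiguous,,2024-06-29 22:08:04.281030
'''


def flagged_rows(log_file) -> list:
    """Return the rows whose `flag` column is non-empty."""
    return [row for row in csv.DictReader(log_file) if row["flag"]]


rows = flagged_rows(io.StringIO(LOG_TEXT))
for row in rows:
    print(row["flag"], row["output"])
```

To read the real file, pass `open("flagged/log.csv", newline="")` instead of the `StringIO` object.
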

## Wrap up & Next Steps
In this notebook we learned how to demo any machine learning model using Gradio.

First, we learned the basics of setting up an interface for a simple Python function; and second, we covered Gradio's true strength: building demos for machine learning models.

For this, we learned how easy it is to leverage models in the 🤗 Hub via the `transformers` library and its `pipeline` function, and how to use multimedia inputs like `gr.Audio`.

Third, we covered how to host your Gradio demo on 🤗 Spaces, which lets you keep your demo running in the cloud and gives you flexibility in terms of the compute requirements for your demo.

Finally, we showcased a few of the batteries-included features that come with Gradio, such as API access, public URLs, and flagging.

For next steps, check out the `Further Reading` links below.

## ⏭️ Further reading
- [Your first demo with gradio](https://www.gradio.app/guides/quickstart#building-your-first-demo)
- [Gradio Components](https://www.gradio.app/docs/gradio/introduction)
- [The transformers library](https://huggingface.co/docs/transformers/en/index)
- [The pipeline function](https://huggingface.co/docs/transformers/en/main_classes/pipelines)
- [Hub Python Library](https://huggingface.co/docs/huggingface_hub/en/package_reference/login)
- [Serverless Inference API](https://huggingface.co/docs/api-inference/index)
- [🤗 Spaces](https://huggingface.co/spaces)
- [Spaces documentation](https://huggingface.co/docs/hub/spaces)


# Task
Create a new code cell that imports `gradio` and `pandas`, defines a wrapper function to convert Gradio inputs to the formats expected by the `analyze_glp1_epigenetics_metabolism` function, and then creates and launches a Gradio interface. The Gradio inputs should be `gr.Dataframe` for metabolic data, `gr.Dataframe` for epigenetic data, and `gr.Textbox` for GLP-1 levels (expected as a comma-separated string of numbers). The output should be `gr.JSON`.

## Generate the Gradio interface

### Subtask:
Create a new code cell that imports `gradio` and `pandas`, defines a wrapper function to convert the Gradio inputs into the formats expected by the `analyze_glp1_epigenetics_metabolism` function, and then creates and launches the Gradio interface. The inputs will be `gr.Dataframe` for the metabolic and epigenetic data, and `gr.Textbox` for the GLP-1 levels (assuming a string of comma-separated numbers that will be parsed into a list of floats). The output will be `gr.JSON`.


## Summary:

### Data Analysis Key Findings
*   A Gradio interface has been specified for the `analyze_glp1_epigenetics_metabolism` function, enabling interactive analysis of GLP-1 epigenetics and metabolism data.
*   The interface is designed to accept metabolic and epigenetic data as `gr.Dataframe` inputs.
*   GLP-1 levels will be provided as a comma-separated string via a `gr.Textbox`, which will then be parsed into a list of floats.
*   The output of the analysis will be displayed in a `gr.JSON` format.

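The parsing step described above can be sketched as a small standalone helper. The name `parse_glp1_levels` is hypothetical; it mirrors the list comprehension in the wrapper, skipping empty entries and raising `ValueError` on non-numeric ones.

```python
def parse_glp1_levels(levels_str: str) -> list:
    """Parse a comma-separated string such as "15, 22, 18" into a list of floats."""
    levels = []
    for raw in levels_str.split(","):
        token = raw.strip()
        if not token:
            continue  # tolerate trailing commas and blank entries
        try:
            levels.append(float(token))
        except ValueError:
            raise ValueError(f"Not a number: {token!r}") from None
    return levels


print(parse_glp1_levels("15, 22, 18"))
```

Keeping the parsing in its own function makes it easy to test separately from the Gradio interface.
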
### Insights or Next Steps
*   This Gradio interface provides a user-friendly way for researchers to interact with the `analyze_glp1_epigenetics_metabolism` function without needing to write code, potentially speeding up data exploration and hypothesis testing.
*   The next step involves populating the Gradio interface with actual metabolic and epigenetic data, along with GLP-1 levels, to test its functionality and validate the `analyze_glp1_epigenetics_metabolism` function.


# Task
The user wants to enhance the `analyze_glp1_epigenetics_metabolism` function to include advanced descriptive statistics and correlation analysis between GLP-1 levels, metabolic data, and numerical epigenetic markers. Additionally, the function should simulate a more sophisticated impact on longevity, and the `results` dictionary should be updated to reflect these new analyses. Finally, the Gradio interface should be updated to showcase these richer results.

## Review the Type of Analysis

### Subtask:
Consider which specific types of analysis (e.g. correlation, regression, statistical tests, predictive models) should be implemented, given the nature of the GLP-1, epigenetic, and metabolic data. For now, we will focus on descriptive statistics and basic correlations.


### Preliminary Data Analysis and Strategy

#### 1. Nature of the Provided Data:
*   **metabolic_data**: Contains 'glucose' and 'insulin'. Both are continuous numerical variables. 'glucose' may indicate blood sugar levels and 'insulin' the hormonal response.
*   **epigenetic_markers**: Contains 'geneA_meth' (numerical, continuous, probably a methylation level) and 'geneB_histone' (categorical, string, probably the histone modification state, such as 'ac' for acetylation or 'me' for methylation). 'geneB_histone' will need specific treatment for quantitative analysis (e.g. one-hot encoding if used in models).
*   **glp1_levels**: A list of GLP-1 levels, which are numerical and continuous.
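
As a hedged illustration of the one-hot encoding mentioned for `geneB_histone`, the sketch below uses `pd.get_dummies` on the three-row dummy markers from this notebook. The variable name `encoded` and the choice of integer indicator columns are assumptions made for the example; the analysis function in this notebook does not perform this step.

```python
import pandas as pd

# Dummy epigenetic markers, matching the example data used in this notebook.
epigenetic_markers = pd.DataFrame({
    "geneA_meth": [0.1, 0.3, 0.2],
    "geneB_histone": ["ac", "me", "ac"],
})

# Replace the categorical column with one 0/1 indicator column per category.
encoded = pd.get_dummies(epigenetic_markers, columns=["geneB_histone"], dtype=int)
print(encoded)
```

Each category becomes its own numeric column (`geneB_histone_ac`, `geneB_histone_me`), which is the form most regression-style models expect.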

#### 2. Appropriate Types of Statistical Analysis (Current Focus):

For this exercise, as requested, we will focus on:

*   **Descriptive Statistics**: Essential for understanding the central tendency, dispersion, and shape of each numerical variable. This includes mean, median, standard deviation, minimum, maximum, and quartiles.
    *   Applies to: `glucose`, `insulin`, `geneA_meth`, `glp1_levels`.

*   **Correlation Analysis**: To explore linear relationships between pairs of numerical variables. The Pearson correlation coefficient is appropriate for measuring the strength and direction of these relationships.
    *   Applies to: `glucose` vs `glp1_levels`, `insulin` vs `glp1_levels`, `geneA_meth` vs `glp1_levels`, `glucose` vs `insulin`, etc.
    *   For `geneB_histone` (categorical), other techniques are needed (e.g. ANOVA to compare GLP-1 across histone groups, or chi-square to analyze the relationship with another categorical variable), so for now it is excluded from the direct linear correlation analysis.
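
For reference, the Pearson coefficient between two samples $x$ and $y$ is $r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \, \sum_i (y_i - \bar{y})^2}}$. The sketch below computes it by hand and compares it with pandas on the dummy glucose and GLP-1 values; `pearson_r` is a hypothetical helper written only for this illustration, and it assumes both inputs have non-zero variance.

```python
import math

import pandas as pd


def pearson_r(x, y):
    """Pearson correlation from the textbook formula (assumes non-constant inputs)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


glucose = [90, 110, 85]
glp1 = [15, 22, 18]

r_manual = pearson_r(glucose, glp1)
r_pandas = pd.Series(glucose).corr(pd.Series(glp1))  # Pearson is the default method
print(r_manual, r_pandas)
```

With only three or four samples, as in the dummy data, the coefficient is very sensitive to individual points, so it should be read as a demonstration of the mechanics rather than as evidence of an association.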

#### 3. Next Steps:
We will implement descriptive statistics for all numerical variables and compute Pearson correlations between the key numerical variables to identify potential associations.

**Reasoning**:
The previous markdown block outlined the plan for descriptive statistics and correlation analysis. To proceed with the implementation, it's necessary to define the dummy data and the `analyze_glp1_epigenetics_metabolism` function, which are both present in cell `eef85961` but not yet executed. Executing this cell will make these objects available in the environment.



In [1]:
import pandas as pd

def analyze_glp1_epigenetics_metabolism(metabolic_data: pd.DataFrame, epigenetic_markers: pd.DataFrame, glp1_levels: list) -> dict:
    """
    Placeholder function to simulate the analysis of GLP-1, epigenetics, and metabolism.
    In a real scenario, this would involve complex bioinformatics and statistical analysis.
    """
    print("Performing preliminary analysis...")

    # Example placeholder for metabolic and epigenetic correlation
    num_metabolic_samples = len(metabolic_data)
    num_epigenetic_samples = len(epigenetic_markers)
    avg_glp1 = sum(glp1_levels) / len(glp1_levels) if glp1_levels else 0

    # Simulate some findings
    results = {
        "message": "Analysis initiated. Further sophisticated models would be required here.",
        "metabolic_samples_processed": num_metabolic_samples,
        "epigenetic_markers_analyzed": num_epigenetic_samples,
        "average_glp1_level": f"{avg_glp1:.2f} pg/mL",
        "potential_longevity_impact": "Requires deeper longitudinal study"
    }

    return results

# Example usage (dummy data)
dummy_metabolic_data = pd.DataFrame({'glucose': [90, 110, 85], 'insulin': [5, 12, 4]})
dummy_epigenetic_markers = pd.DataFrame({'geneA_meth': [0.1, 0.3, 0.2], 'geneB_histone': ['ac', 'me', 'ac']})
dummy_glp1_levels = [15, 22, 18]

# Call the placeholder function
analysis_output = analyze_glp1_epigenetics_metabolism(
    metabolic_data=dummy_metabolic_data,
    epigenetic_markers=dummy_epigenetic_markers,
    glp1_levels=dummy_glp1_levels
)
print(analysis_output)

Performing preliminary analysis...
{'message': 'Analysis initiated. Further sophisticated models would be required here.', 'metabolic_samples_processed': 3, 'epigenetic_markers_analyzed': 3, 'average_glp1_level': '18.33 pg/mL', 'potential_longevity_impact': 'Requires deeper longitudinal study'}


**Reasoning**:
To implement the descriptive statistics and correlation analysis as outlined in the strategy, I will modify the `analyze_glp1_epigenetics_metabolism` function to include these calculations and update the results dictionary accordingly.



In [2]:
import pandas as pd
import numpy as np

def analyze_glp1_epigenetics_metabolism(metabolic_data: pd.DataFrame, epigenetic_markers: pd.DataFrame, glp1_levels: list) -> dict:
    """
    Analyzes GLP-1 levels, metabolic data, and epigenetic markers, including
    descriptive statistics and basic correlation analysis.
    """
    print("Performing advanced analysis: descriptive statistics and correlations...")

    # 1. Descriptive Statistics
    descriptive_stats = {}

    # Metabolic Data
    if not metabolic_data.empty:
        for col in ['glucose', 'insulin']:
            if col in metabolic_data.columns:
                descriptive_stats[f'metabolic_{col}_stats'] = metabolic_data[col].describe().to_dict()

    # Epigenetic Markers (numerical)
    if not epigenetic_markers.empty:
        for col in ['geneA_meth']:
            if col in epigenetic_markers.columns:
                descriptive_stats[f'epigenetic_{col}_stats'] = epigenetic_markers[col].describe().to_dict()

    # GLP-1 Levels
    if glp1_levels:
        glp1_series = pd.Series(glp1_levels)
        descriptive_stats['glp1_levels_stats'] = glp1_series.describe().to_dict()

    # 2. Correlation Analysis
    correlations = {}
    # Combine all numerical data into a single DataFrame for correlation
    combined_data = pd.DataFrame({'glp1_level': glp1_levels})
    if not metabolic_data.empty:
        combined_data = pd.concat([combined_data, metabolic_data[['glucose', 'insulin']]], axis=1)
    if not epigenetic_markers.empty:
        combined_data = pd.concat([combined_data, epigenetic_markers[['geneA_meth']]], axis=1)

    # Ensure all columns are numeric, coerce errors to NaN
    combined_data = combined_data.apply(pd.to_numeric, errors='coerce')

    # Drop rows with any NaN values that would prevent correlation calculation
    combined_data = combined_data.dropna()

    if not combined_data.empty and len(combined_data.columns) > 1:
        correlation_matrix = combined_data.corr(method='pearson')
        correlations['pearson_correlation_matrix'] = correlation_matrix.to_dict()
    else:
        correlations['pearson_correlation_matrix'] = "Not enough numerical data for correlation analysis."

    # Simulate some findings based on new analysis
    num_metabolic_samples = len(metabolic_data)
    num_epigenetic_samples = len(epigenetic_markers)
    avg_glp1 = sum(glp1_levels) / len(glp1_levels) if glp1_levels else 0

    results = {
        "message": "Advanced analysis completed, including descriptive statistics and correlations.",
        "metabolic_samples_processed": num_metabolic_samples,
        "epigenetic_markers_analyzed": num_epigenetic_samples,
        "average_glp1_level": f"{avg_glp1:.2f} pg/mL",
        "descriptive_statistics": descriptive_stats,
        "correlation_analysis": correlations,
        "potential_longevity_impact": "Further interpretation needed based on statistical significance and biological context."
    }

    return results

# Example usage with updated dummy data to ensure all columns are available
dummy_metabolic_data = pd.DataFrame({'glucose': [90, 110, 85, 95], 'insulin': [5, 12, 4, 7]})
dummy_epigenetic_markers = pd.DataFrame({'geneA_meth': [0.1, 0.3, 0.2, 0.25], 'geneB_histone': ['ac', 'me', 'ac', 'me']})
dummy_glp1_levels = [15, 22, 18, 20]

# Call the enhanced function
analysis_output_enhanced = analyze_glp1_epigenetics_metabolism(
    metabolic_data=dummy_metabolic_data,
    epigenetic_markers=dummy_epigenetic_markers,
    glp1_levels=dummy_glp1_levels
)
print(analysis_output_enhanced)


Performing advanced analysis: descriptive statistics and correlations...
{'message': 'Advanced analysis completed, including descriptive statistics and correlations.', 'metabolic_samples_processed': 4, 'epigenetic_markers_analyzed': 4, 'average_glp1_level': '18.75 pg/mL', 'descriptive_statistics': {'metabolic_glucose_stats': {'count': 4.0, 'mean': 95.0, 'std': 10.801234497346433, 'min': 85.0, '25%': 88.75, '50%': 92.5, '75%': 98.75, 'max': 110.0}, 'metabolic_insulin_stats': {'count': 4.0, 'mean': 7.0, 'std': 3.559026084010437, 'min': 4.0, '25%': 4.75, '50%': 6.0, '75%': 8.25, 'max': 12.0}, 'epigenetic_geneA_meth_stats': {'count': 4.0, 'mean': 0.21250000000000002, 'std': 0.08539125638299665, 'min': 0.1, '25%': 0.17500000000000002, '50%': 0.225, '75%': 0.2625, 'max': 0.3}, 'glp1_levels_stats': {'count': 4.0, 'mean': 18.75, 'std': 2.9860788111948193, 'min': 15.0, '25%': 17.25, '50%': 19.0, '75%': 20.5, 'max': 22.0}}, 'correlation_analysis': {'pearson_correlation_matrix': {'glp1_level': {'
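
One consequence of the correlation step above is worth illustrating: `pd.concat(..., axis=1)` aligns rows by index, and the following `dropna()` removes any row that is incomplete. The sketch below uses a deliberately mismatched input (four GLP-1 readings, three metabolic rows), which is an assumption made for the example rather than data from this notebook.

```python
import pandas as pd

glp1_levels = [15, 22, 18, 20]  # four GLP-1 readings
metabolic_data = pd.DataFrame(  # only three metabolic rows
    {"glucose": [90, 110, 85], "insulin": [5, 12, 4]}
)

# Column-wise concat aligns on the row index; the unmatched fourth row gets NaN.
combined = pd.concat([pd.DataFrame({"glp1_level": glp1_levels}), metabolic_data], axis=1)

# dropna() keeps only rows where every column has a value.
complete = combined.dropna()
print(len(combined), len(complete))
```

The unmatched GLP-1 reading is silently excluded from the correlation matrix, so the descriptive statistics and the correlations may end up being computed on different numbers of samples.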

**Reasoning**:
The subtask requires updating the Gradio interface to reflect the enhanced `analyze_glp1_epigenetics_metabolism` function. The existing `gradio_analyze_glp1` wrapper function and the `gr.Interface` setup in cell `7e7c49c3` are already designed to handle `pd.DataFrame` inputs for metabolic and epigenetic data and a comma-separated string for GLP-1 levels, outputting a JSON. Since the `analyze_glp1_epigenetics_metabolism` function now returns a richer dictionary, `gr.JSON` is perfectly suitable for displaying these enhanced results. Therefore, I will regenerate the Gradio interface code to ensure it uses the newly defined and enhanced analysis function and its wrapper.
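
The conversion the wrapper needs can be sketched without launching Gradio. This assumes that a `gr.Dataframe` created with `type="array"` hands the function the table body as a list of rows, without a header row; `rows_to_frame` is a hypothetical helper name used only here.

```python
import pandas as pd


def rows_to_frame(rows, columns):
    """Build a DataFrame from a list of rows, or an empty frame with the same columns."""
    if not rows:
        return pd.DataFrame(columns=columns)
    return pd.DataFrame(rows, columns=columns)


# Rows as they would arrive from the metabolic data table (assumed shape).
metabolic_rows = [[90, 5], [110, 12], [85, 4], [95, 7]]
metabolic_df = rows_to_frame(metabolic_rows, ["glucose", "insulin"])
print(metabolic_df)
```

Supplying the column names explicitly keeps every submitted row as data, so the analysis function can always find `glucose` and `insulin` by name.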



In [3]:
import gradio as gr
import pandas as pd

def gradio_analyze_glp1(metabolic_data_input: list, epigenetic_markers_input: list, glp1_levels_str: str) -> dict:
    """
    Wrapper function to convert Gradio inputs to the formats expected by
    analyze_glp1_epigenetics_metabolism and call it.
    """
    # gr.Dataframe(type="array") passes only the table rows (no header row),
    # so the column names are supplied explicitly when building each DataFrame.
    try:
        if metabolic_data_input:
            metabolic_data_df = pd.DataFrame(metabolic_data_input, columns=["glucose", "insulin"])
        else:
            metabolic_data_df = pd.DataFrame(columns=["glucose", "insulin"]) # Ensure columns even if empty

        if epigenetic_markers_input:
            epigenetic_markers_df = pd.DataFrame(epigenetic_markers_input, columns=["geneA_meth", "geneB_histone"])
        else:
            epigenetic_markers_df = pd.DataFrame(columns=["geneA_meth", "geneB_histone"]) # Ensure columns even if empty

        # Convert comma-separated string of GLP-1 levels to a list of floats
        glp1_levels_list = [float(level.strip()) for level in glp1_levels_str.split(',') if level.strip()]

        return analyze_glp1_epigenetics_metabolism(
            metabolic_data=metabolic_data_df,
            epigenetic_markers=epigenetic_markers_df,
            glp1_levels=glp1_levels_list
        )
    except ValueError as e:
        raise gr.Error(f"Error converting the inputs: {e}. Make sure the data is in the correct format.")
    except Exception as e:
        raise gr.Error(f"An error occurred: {e}")

# Create Gradio Interface
glp1_demo = gr.Interface(
    fn=gradio_analyze_glp1,
    inputs=[
        gr.Dataframe(headers=["glucose", "insulin"], type="array", label="Metabolic Data", value=dummy_metabolic_data.values.tolist()),
        gr.Dataframe(headers=["geneA_meth", "geneB_histone"], type="array", label="Epigenetic Markers", value=dummy_epigenetic_markers.values.tolist()),
        gr.Textbox(label="GLP-1 Levels (comma-separated numbers)", value=", ".join(map(str, dummy_glp1_levels)))
    ],
    outputs=gr.JSON(label="Analysis Results"),
    title="Advanced GLP-1, Epigenetics, and Metabolism Analysis",
    description="Enter the metabolic data, epigenetic markers, and GLP-1 levels to run the advanced analysis (descriptive statistics and correlations)."
)

# Launch the demo
glp1_demo.launch()

It looks like you are running Gradio on a hosted Jupyter notebook, which requires `share=True`. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://f344d4a27140df81e2.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)




## Simulate a Longevity Impact

### Subtask:
Simulate a more sophisticated impact on longevity within the `analyze_glp1_epigenetics_metabolism` function, based on the results of the descriptive statistics and correlation analysis.


**Reasoning**:
The subtask requires simulating a more sophisticated longevity impact within the `analyze_glp1_epigenetics_metabolism` function, using the descriptive statistics and correlation analysis results. I will modify the function to calculate a simulated longevity impact score based on average GLP-1 levels and key correlations, and then update the `results` dictionary to reflect this new analysis.



In [4]:
import pandas as pd
import numpy as np

def analyze_glp1_epigenetics_metabolism(metabolic_data: pd.DataFrame, epigenetic_markers: pd.DataFrame, glp1_levels: list) -> dict:
    """
    Analyzes GLP-1 levels, metabolic data, and epigenetic markers, including
    descriptive statistics, basic correlation analysis, and a simulated longevity impact.
    """
    print("Performing advanced analysis: descriptive statistics, correlations, and longevity impact simulation...")

    # 1. Descriptive Statistics
    descriptive_stats = {}

    # Metabolic Data
    if not metabolic_data.empty:
        for col in ['glucose', 'insulin']:
            if col in metabolic_data.columns:
                descriptive_stats[f'metabolic_{col}_stats'] = metabolic_data[col].describe().to_dict()

    # Epigenetic Markers (numerical)
    if not epigenetic_markers.empty:
        for col in ['geneA_meth']:
            if col in epigenetic_markers.columns:
                descriptive_stats[f'epigenetic_{col}_stats'] = epigenetic_markers[col].describe().to_dict()

    # GLP-1 Levels
    if glp1_levels:
        glp1_series = pd.Series(glp1_levels)
        descriptive_stats['glp1_levels_stats'] = glp1_series.describe().to_dict()

    # 2. Correlation Analysis
    correlations = {}
    # Combine all numerical data into a single DataFrame for correlation
    combined_data = pd.DataFrame({'glp1_level': glp1_levels})
    if not metabolic_data.empty:
        combined_data = pd.concat([combined_data, metabolic_data[['glucose', 'insulin']]], axis=1)
    if not epigenetic_markers.empty:
        combined_data = pd.concat([combined_data, epigenetic_markers[['geneA_meth']]], axis=1)

    # Ensure all columns are numeric, coerce errors to NaN
    combined_data = combined_data.apply(pd.to_numeric, errors='coerce')

    # Drop rows with any NaN values that would prevent correlation calculation
    combined_data = combined_data.dropna()

    correlation_matrix_dict = "Not enough numerical data for correlation analysis."
    if not combined_data.empty and len(combined_data.columns) > 1:
        correlation_matrix = combined_data.corr(method='pearson')
        correlation_matrix_dict = correlation_matrix.to_dict()
    correlations['pearson_correlation_matrix'] = correlation_matrix_dict

    # 3. Simulate Longevity Impact
    longevity_impact_assessment = "No clear longevity impact assessment due to insufficient data or complex interactions."
    if glp1_levels and not combined_data.empty and 'glp1_level' in combined_data.columns:
        avg_glp1 = descriptive_stats.get('glp1_levels_stats', {}).get('mean', 0)
        # Define thresholds and weights for impact assessment (simplified for simulation)
        glp1_threshold_good = 18.0 # Example: GLP-1 above this is generally good
        glp1_threshold_excellent = 20.0

        longevity_score = 0
        assessment_factors = []

        # Factor 1: Average GLP-1 Level
        if avg_glp1 > glp1_threshold_excellent:
            longevity_score += 2
            assessment_factors.append(f"Excellent average GLP-1 level ({avg_glp1:.2f} pg/mL).")
        elif avg_glp1 > glp1_threshold_good:
            longevity_score += 1
            assessment_factors.append(f"Good average GLP-1 level ({avg_glp1:.2f} pg/mL).")
        else:
            longevity_score -= 1
            assessment_factors.append(f"Suboptimal average GLP-1 level ({avg_glp1:.2f} pg/mL).")

        # Factor 2: Correlations (if available and meaningful)
        if isinstance(correlation_matrix_dict, dict):
            # Assuming 'geneA_meth' has a positive impact if correlated with GLP-1
            glp1_geneA_meth_corr = correlation_matrix_dict.get('glp1_level', {}).get('geneA_meth', 0)
            if glp1_geneA_meth_corr > 0.5: # Strong positive correlation
                longevity_score += 1
                assessment_factors.append(f"Strong positive correlation between GLP-1 and geneA_meth ({glp1_geneA_meth_corr:.2f}), suggesting beneficial epigenetic regulation.")
            elif glp1_geneA_meth_corr < -0.5: # Strong negative correlation
                longevity_score -= 1
                assessment_factors.append(f"Strong negative correlation between GLP-1 and geneA_meth ({glp1_geneA_meth_corr:.2f}), potentially unfavorable epigenetic regulation.")

            # Assuming 'glucose' correlation with GLP-1: high positive correlation might imply resistance or compensatory mechanisms
            glp1_glucose_corr = correlation_matrix_dict.get('glp1_level', {}).get('glucose', 0)
            if glp1_glucose_corr > 0.6: # Strong positive correlation
                longevity_score -= 1
                assessment_factors.append(f"Strong positive correlation between GLP-1 and glucose ({glp1_glucose_corr:.2f}), which could indicate metabolic stress or resistance.")

        # Final Longevity Assessment based on score
        if longevity_score >= 2:
            longevity_impact_assessment = "Potentially positive impact on longevity, supported by favorable GLP-1 levels and beneficial correlations." + "\nFactors: " + "; ".join(assessment_factors)
        elif longevity_score == 1:
            longevity_impact_assessment = "Likely neutral to slightly positive impact on longevity, with some favorable indicators." + "\nFactors: " + "; ".join(assessment_factors)
        else:
            longevity_impact_assessment = "Impact on longevity is unclear or potentially negative, requiring further investigation." + "\nFactors: " + "; ".join(assessment_factors)


    # Simulate some findings based on new analysis
    num_metabolic_samples = len(metabolic_data)
    num_epigenetic_samples = len(epigenetic_markers)
    avg_glp1_display = descriptive_stats.get('glp1_levels_stats', {}).get('mean', 0)

    results = {
        "message": "Advanced analysis completed, including descriptive statistics, correlations, and simulated longevity impact.",
        "metabolic_samples_processed": num_metabolic_samples,
        "epigenetic_markers_analyzed": num_epigenetic_samples,
        "average_glp1_level": f"{avg_glp1_display:.2f} pg/mL",
        "descriptive_statistics": descriptive_stats,
        "correlation_analysis": correlations,
        "simulated_longevity_impact": longevity_impact_assessment
    }

    return results

# Example usage with updated dummy data to ensure all columns are available
dummy_metabolic_data = pd.DataFrame({'glucose': [90, 110, 85, 95], 'insulin': [5, 12, 4, 7]}) # Added one more row for consistency
dummy_epigenetic_markers = pd.DataFrame({'geneA_meth': [0.1, 0.3, 0.2, 0.25], 'geneB_histone': ['ac', 'me', 'ac', 'me']}) # Added one more row
dummy_glp1_levels = [15, 22, 18, 20] # Added one more value

# Call the enhanced function
analysis_output_enhanced_longevity = analyze_glp1_epigenetics_metabolism(
    metabolic_data=dummy_metabolic_data,
    epigenetic_markers=dummy_epigenetic_markers,
    glp1_levels=dummy_glp1_levels
)
print(analysis_output_enhanced_longevity)

# Test with different dummy data for a different impact
dummy_metabolic_data_low_glp1 = pd.DataFrame({'glucose': [120, 130, 115, 125], 'insulin': [15, 20, 14, 18]})
dummy_epigenetic_markers_low_glp1 = pd.DataFrame({'geneA_meth': [0.05, 0.08, 0.07, 0.06], 'geneB_histone': ['ac', 'me', 'ac', 'me']})
dummy_glp1_levels_low = [10, 11, 9, 12]

analysis_output_low_glp1 = analyze_glp1_epigenetics_metabolism(
    metabolic_data=dummy_metabolic_data_low_glp1,
    epigenetic_markers=dummy_epigenetic_markers_low_glp1,
    glp1_levels=dummy_glp1_levels_low
)
print("\n--- Analysis with Low GLP-1 and unfavorable correlations ---")
print(analysis_output_low_glp1)

Performing advanced analysis: descriptive statistics, correlations, and longevity impact simulation...
{'message': 'Advanced analysis completed, including descriptive statistics, correlations, and simulated longevity impact.', 'metabolic_samples_processed': 4, 'epigenetic_markers_analyzed': 4, 'average_glp1_level': '18.75 pg/mL', 'descriptive_statistics': {'metabolic_glucose_stats': {'count': 4.0, 'mean': 95.0, 'std': 10.801234497346433, 'min': 85.0, '25%': 88.75, '50%': 92.5, '75%': 98.75, 'max': 110.0}, 'metabolic_insulin_stats': {'count': 4.0, 'mean': 7.0, 'std': 3.559026084010437, 'min': 4.0, '25%': 4.75, '50%': 6.0, '75%': 8.25, 'max': 12.0}, 'epigenetic_geneA_meth_stats': {'count': 4.0, 'mean': 0.21250000000000002, 'std': 0.08539125638299665, 'min': 0.1, '25%': 0.17500000000000002, '50%': 0.225, '75%': 0.2625, 'max': 0.3}, 'glp1_levels_stats': {'count': 4.0, 'mean': 18.75, 'std': 2.9860788111948193, 'min': 15.0, '25%': 17.25, '50%': 19.0, '75%': 20.5, 'max': 22.0}}, 'correlation_

## Final Task

### Subtask:
Summarize the changes made to the `analyze_glp1_epigenetics_metabolism` function to make it more realistic and discuss the simulated results.


## Summary:

### Data Analysis Key Findings

*   The `analyze_glp1_epigenetics_metabolism` function was significantly enhanced to provide a more realistic analysis.
*   **Descriptive Statistics**: The function now computes detailed descriptive statistics (mean, median, standard deviation, min, max, quartiles) for all numerical variables: `glucose`, `insulin`, `geneA_meth`, and `glp1_levels`.
*   **Correlation Analysis**: A Pearson correlation matrix is now generated to show linear relationships between `glp1_level`, `glucose`, `insulin`, and `geneA_meth`. Categorical epigenetic markers (`geneB_histone`) were identified as needing different analysis methods and were excluded from this linear correlation.
*   **Sophisticated Longevity Impact Simulation**: A new simulation was added to assess potential longevity impact, incorporating:
    *   **Average GLP-1 Level**: Levels above 18.0 pg/mL (good) or 20.0 pg/mL (excellent) positively influence the longevity score, while suboptimal levels contribute negatively. For example, an average GLP-1 of 18.75 pg/mL indicated a "Good average GLP-1 level," whereas 10.50 pg/mL was "Suboptimal."
    *   **GLP-1 vs. Epigenetic Marker Correlation**: A strong positive correlation (> 0.5) between GLP-1 and `geneA_meth` is interpreted as a beneficial epigenetic regulation, adding to a positive longevity score.
    *   **GLP-1 vs. Metabolic Data Correlation**: A strong positive correlation (> 0.6) between GLP-1 and `glucose` is considered indicative of metabolic stress or resistance, negatively impacting the longevity score.
*   **Gradio Interface Update**: The Gradio interface was updated to accept `metabolic_data` and `epigenetic_markers` as dataframes and `glp1_levels` as a comma-separated string, outputting the comprehensive results, including descriptive statistics, correlation analysis, and the simulated longevity impact, in a JSON format.
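
The scoring rule summarized above can be condensed into a few lines to make the arithmetic explicit. This is a sketch that mirrors the thresholds in the notebook's function; `longevity_score` is a hypothetical standalone helper. The inputs are the dummy-data values reported in this notebook: an average GLP-1 of 18.75 pg/mL and the Pearson matrix entries for `glp1_level` vs `geneA_meth` (0.996791) and `glp1_level` vs `glucose` (0.775114).

```python
def longevity_score(avg_glp1, corr_gene_a_meth, corr_glucose):
    """Simulated score: GLP-1 level term plus two correlation terms."""
    score = 0

    # Factor 1: average GLP-1 level (thresholds 18.0 and 20.0 pg/mL).
    if avg_glp1 > 20.0:
        score += 2
    elif avg_glp1 > 18.0:
        score += 1
    else:
        score -= 1

    # Factor 2a: GLP-1 vs geneA_meth correlation.
    if corr_gene_a_meth > 0.5:
        score += 1
    elif corr_gene_a_meth < -0.5:
        score -= 1

    # Factor 2b: a strong positive GLP-1 vs glucose correlation is penalized.
    if corr_glucose > 0.6:
        score -= 1

    return score


score = longevity_score(18.75, 0.996791, 0.775114)
print(score)  # +1 (good GLP-1) +1 (geneA_meth) -1 (glucose) = 1
```

A score of 1 falls into the "likely neutral to slightly positive" branch of the function. The thresholds and weights are illustrative placeholders rather than biologically validated values.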

### Insights or Next Steps

*   The current longevity simulation is a simplified model. Future enhancements could involve implementing more complex predictive models (e.g., machine learning) that weigh various factors (metabolic, epigenetic, GLP-1) and their interactions based on established biological pathways or longitudinal study data for a more accurate longevity forecast.
*   The categorical epigenetic marker `geneB_histone` is currently excluded from correlation analysis. Future work should incorporate appropriate statistical methods (e.g., ANOVA, chi-square tests, or encoding for regression models) to understand its relationship with GLP-1 levels and metabolic health.
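
As a first, purely descriptive step toward the group comparison suggested above, GLP-1 levels can be summarized per histone state with a `groupby`. The sketch pairs the four-row dummy GLP-1 values with the dummy `geneB_histone` labels from this notebook; it is not a significance test, and with two samples per group no inference should be drawn from it.

```python
import pandas as pd

# Dummy GLP-1 levels paired row by row with the dummy histone states.
data = pd.DataFrame({
    "glp1_level": [15, 22, 18, 20],
    "geneB_histone": ["ac", "me", "ac", "me"],
})

# Sample count and mean GLP-1 level for each histone modification state.
group_summary = data.groupby("geneB_histone")["glp1_level"].agg(["count", "mean"])
print(group_summary)
```

A formal test such as one-way ANOVA would compare these group means against the within-group variability, which requires considerably more samples than the dummy data provides.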
