<a href="https://colab.research.google.com/github/octopus2023-inc/gensphere/blob/main/GenSphere_tutorial.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **GenSphere tutorial**

This quick tutorial will walk you through the main functionalities of [GenSphere](https://github.com/octopus2023-inc/gensphere).

We will follow a guided example **in which we create a workflow that finds the latest product releases at [producthunt.com](https://producthunt.com), searches for traction information such as revenue and number of users, and analyzes a new startup idea based on that.**

By completing this tutorial, you will learn about the main functionalities of GenSphere, such as:


1.   Defining workflows with yaml files;
2.   Pulling from the platform;
3.   Nesting workflows;
4.   Using custom functions and schemas, as well as using langchain and composio tools;
5.   Visualizing workflows;
6.   Pushing to the platform.







### **0. Install GenSphere and other libs to be used**

In [1]:
!pip install fastapi pandas

Collecting fastapi
  Downloading fastapi-0.115.4-py3-none-any.whl.metadata (27 kB)
Collecting starlette<0.42.0,>=0.40.0 (from fastapi)
  Downloading starlette-0.41.2-py3-none-any.whl.metadata (6.0 kB)
Downloading fastapi-0.115.4-py3-none-any.whl (94 kB)
Downloading starlette-0.41.2-py3-none-any.whl (73 kB)
Installing collected packages: starlette, fastapi
Successfully installed fastapi-0.115.4 starlette-0.41.2


In [2]:
!pip install --extra-index-url https://test.pypi.org/simple/ gensphere==0.1.8

Looking in indexes: https://pypi.org/simple, https://test.pypi.org/simple/
Collecting gensphere==0.1.8
  Downloading https://test-files.pythonhosted.org/packages/28/11/896662778f2fff0377c73f9955333ba13d374177048ee02f643cb02a84d4/gensphere-0.1.8-py3-none-any.whl.metadata (7.6 kB)
Collecting composio-core<0.6.0,>=0.5.35 (from gensphere==0.1.8)
  Downloading composio_core-0.5.37-py3-none-any.whl.metadata (14 kB)
Collecting composio-openai<0.6.0,>=0.5.35 (from gensphere==0.1.8)
  Downloading composio_openai-0.5.37-py3-none-any.whl.metadata (2.6 kB)
Collecting dash<3.0.0,>=2.18.1 (from gensphere==0.1.8)
  Downloading dash-2.18.1-py3-none-any.whl.metadata (10 kB)
Collecting dash-cytoscape<2.0.0,>=1.0.2 (from gensphere==0.1.8)
  Downloading dash_cytoscape-1.0.2.tar.gz (4.0 MB)
  Preparing metadata (setup.py) ... done
Collecting langchain-community<0.4.0,>=0.

### **1. Import GenSphere**

In [3]:
import logging
import traceback


# Set up logging configuration before importing other modules
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler("../../app.log", mode='w'),
        logging.StreamHandler()
    ]
)

In [4]:
from gensphere import genflow, yaml_utils
from gensphere.genflow import GenFlow
from gensphere.yaml_utils import YamlCompose
from gensphere.visualizer import Visualizer
from gensphere.hub import Hub
import dotenv
from dotenv import load_dotenv
import os

### **2. Define your environment variables**

Replace the placeholder values below with your own API keys.

In [5]:
os.environ['OPENAI_API_KEY']="PLACE-YOUR-OPENAI-API-KEY"
os.environ['COMPOSIO_API_KEY']="PLACE-YOUR-COMPOSIO-API-KEY" #if you don't have one, visit composio.dev
os.environ['FIRECRAWL_API_KEY']="PLACE-YOUR-FIRECRAWL-API-KEY" # if you don't have one, visit firecrawl.dev
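
Before running anything that calls an API, it can help to confirm that the placeholders in the cell above were actually replaced. The helper below is a minimal sketch and not part of GenSphere; `find_unset_keys` is a hypothetical name, and the `PLACE-YOUR` prefix check is an assumption based on the placeholder strings used in this tutorial.

```python
import os

REQUIRED_KEYS = ["OPENAI_API_KEY", "COMPOSIO_API_KEY", "FIRECRAWL_API_KEY"]

def find_unset_keys(env, required=REQUIRED_KEYS, placeholder_prefix="PLACE-YOUR"):
    """Return the names of keys that are missing, empty, or still placeholders."""
    unset = []
    for name in required:
        value = env.get(name, "")
        # Assumption: an untouched placeholder still starts with "PLACE-YOUR".
        if not value or value.startswith(placeholder_prefix):
            unset.append(name)
    return unset

unset = find_unset_keys(os.environ)
if unset:
    print("Still need real values for:", ", ".join(unset))
```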



In [None]:
!composio add firecrawl #for this project, we will be using firecrawl. Get an API key and add it by following the steps here.






> Adding integration: Firecrawl...

> Enter API Key: fc-02f456f0cc524af393002a0514c4ed73
> Enter Base URL (Optional):
✔ firecrawl added successfully with ID: 891f8bc7-64b5-4a84-91e9-d1398aa9b9cf


### **3. Define your workflow with a yaml file.**

Our aim is to create a workflow that automatically finds the **latest product releases from Product Hunt, explores their revenue and traction, and analyzes a new startup idea based on that**. We will use pre-built components from the platform to accelerate our development.

#### **GenSphere project structure**

There are 3 fundamental files in a GenSphere project.


1.   **Yaml file** - contains the workflow definition
2.   **Functions file** - .py file containing all functions to be used, either as nodes in the graph or as tools during LLM function calling
3.   **Schemas file** - .py file containing pydantic schemas. These are used when you want structured outputs from OpenAI.



#### **3.1 Pull a base yaml file from the platform**

In this tutorial, we will use a **pre-built workflow from our open platform** that extracts information from producthunt.com. We will nest that into a bigger workflow to achieve our objective of analyzing a new startup idea.






In [6]:
path_to_save_yaml_file='product_hunt_analyzer.yaml'
path_to_save_functions_file='gensphere_functions.py'
path_to_save_schema_file='structured_output_schema.py'

hub=Hub()
hub.pull(push_id='de8afbeb-06cb-4f8f-8ead-64d9e6ef5326',
         yaml_filename=path_to_save_yaml_file,
         functions_filename=path_to_save_functions_file,
         schema_filename=path_to_save_schema_file,
         save_to_disk=True)



{'yaml_file.yaml': '# product_hunt_analyzer.yaml\r\n\r\nnodes:      \r\n  - name: get_current_date\r\n    type: function_call\r\n    function: get_current_date_function\r\n    outputs:\r\n      - current_date\r\n      \r\n  - name: get_timewindow\r\n    type: function_call\r\n    function: get_timewindow_function\r\n    outputs:\r\n      - time_window\r\n  \r\n  - name: product_hunt_scrape\r\n    type: llm_service\r\n    service: openai\r\n    model: "gpt-4o-2024-08-06"\r\n    tools:\r\n      - COMPOSIO.FIRECRAWL_SCRAPE\r\n    params:\r\n      prompt: |\r\n         You should visit producthunt at https://www.producthunt.com/leaderboard/monthly/yyyy/mm \r\n         Today is {{ get_current_date.current_date }}\r\n         You should subsitute yyyy and mm by year and month you want to search.\r\n         The search time window should be {{ get_timewindow.time_window }}.\r\n         After that, you should extract raw content from the htmls associated,\r\n         which will contain informa

The yaml file has been saved locally as **"product_hunt_analyzer.yaml"**. We also saved the functions and schema files as **gensphere_functions.py** and **structured_output_schema.py**. Here are the full contents of the yaml file:



```
# product_hunt_analyzer.yaml

nodes:      

  - name: get_current_date
    type: function_call
    function: get_current_date_function
    outputs:
      - current_date

  - name: get_timewindow
    type: function_call
    function: get_timewindow_function
    outputs:
      - time_window

  - name: product_hunt_scrape
    type: llm_service
    service: openai
    model: "gpt-4o-2024-08-06"
    tools:
      - COMPOSIO.FIRECRAWL_SCRAPE
    params:
      prompt: |

         You should visit producthunt at https://www.producthunt.com/leaderboard/monthly/yyyy/mm
         Today is {{ get_current_date.current_date }}
         You should subsitute yyyy and mm by year and month you want to search.
         The search time window should be {{ get_timewindow.time_window }}.
         After that, you should extract raw content from the htmls associated,
         which will contain information about new product launches, their companies, number of upvotes, etc.
         Scroll the page until the end and wait a few miliseconds for it to launch before scraping.

    outputs:
      - product_hunt_scrape_results    

      
  - name: extract_info_from_search
    type: llm_service
    service: openai
    model: "gpt-4o-2024-08-06"
    structured_output_schema: StartupInformationList
    params:
      prompt: |

         You are given reports from a search to https://www.producthunt.com/leaderboard/monthly/, containing
         products featured there last month:
         {{ product_hunt_scrape.product_hunt_scrape_results }}.
         We want to extract accurate information about these new product launches.
         Structure the information there by the following dimensions:  product name, company name, company url, number of upvotes, business model
         brief description of it.
    outputs:
      - structured_search_info

  - name: postprocess_search_results
    type: function_call
    function: postprocess_search_results_functions
    params:
      info: '{{ extract_info_from_search.structured_search_info }}'
    outputs:
      - postprocessed_search_results

      

  - name: find_extra_info
    type: llm_service
    service: openai
    model: "gpt-4o-2024-08-06"
    tools:
      - COMPOSIO.TAVILY_TAVILY_SEARCH
    params:
      prompt: |

         You should conduct a comprehensive search on the web about the following entry from producthunt.com:
         {{ postprocess_search_results.postprocessed_search_results[i] }}. You should look to find relevant news
         about the company, specially related to its revenue, valuation, traction, acquisition if applicable, number of users, etc.

    outputs:
      - startup_extra_info
```



#### **3.2 Visualize your project**

Let's analyze the project we just pulled by using the visualizer class. **You can zoom in and out, and by clicking on a node, you can see all functions and schemas, inputs and outputs associated with it.**

**Note**: Visualizing the graph from inside Google Colab is slightly cumbersome. If you run locally, you can choose the address where the visualization is served and open it in your browser.

In [7]:
viz=Visualizer('product_hunt_analyzer.yaml',
               'gensphere_functions.py',
               'structured_output_schema.py',
               address='127.0.0.1', port=8050)
viz.start_visualization()

INFO:gensphere.graph_builder:Total elements generated: 11

Dash.run_server is deprecated and will be removed in Dash 3.0




#### **3.3 Understand the syntax of the yaml file**


Let's go through the YAML syntax step by step. **There are 3 node types: function_call, llm_service and yml_flow.**

##### 3.3.1 **function_call nodes**

function_call nodes execute a function defined in a .py file (which you pass when triggering execution). They have a **params field and an outputs field**.

For instance, have a look at the get_current_date node:



```
  - name: get_current_date
    type: function_call
    function: get_current_date_function
    outputs:
      - current_date
```
Here we are instructing GenSphere to execute the function get_current_date_function; in this case there are no 'params'. This function is defined in gensphere_functions.py, which was pulled from the platform together with the yaml file.



```
# gensphere_functions.py

from datetime import datetime

def get_current_date_function():
    # Return today's date as YYYY-MM-DD under the key declared in the yaml 'outputs'
    return {'current_date': datetime.today().strftime('%Y-%m-%d')}
```



**Important notes**:  

1.   If you want to use other nodes' outputs as inputs, you can reference them with the syntax **{{ node_name.output_name }}** in the 'params' field of the node.
2.   **A function's output must be a dict**, whose keys must match the outputs defined in the yaml file.
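
To make both notes concrete, here is a sketch of a function that takes parameters and returns a dict keyed by its declared output. The function `shift_date_function`, its `days` parameter, and the node shown in the comment are hypothetical; they are not part of the files pulled from the platform.

```python
from datetime import datetime, timedelta

# Hypothetical node that would call the function below:
#
#   - name: shift_date
#     type: function_call
#     function: shift_date_function
#     params:
#       start_date: '{{ get_current_date.current_date }}'
#       days: 30
#     outputs:
#       - shifted_date

def shift_date_function(start_date, days):
    """Move a 'YYYY-MM-DD' date back by `days` days."""
    start = datetime.strptime(start_date, '%Y-%m-%d')
    # int() is a precaution in case the value arrives as a string.
    shifted = start - timedelta(days=int(days))
    # The key must match the name listed under 'outputs' in the yaml node.
    return {'shifted_date': shifted.strftime('%Y-%m-%d')}

print(shift_date_function('2024-11-01', 30))
```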





##### **3.3.2 llm_service nodes**

These nodes execute LLM API calls. In the current version, we only support OpenAI, including [structured outputs](https://openai.com/index/introducing-structured-outputs-in-the-api/) and [function calling](https://platform.openai.com/docs/guides/function-calling). For instance, have a look at the node product_hunt_scrape:



```
# product_hunt_analyzer.yaml

  - name: product_hunt_scrape
    type: llm_service
    service: openai
    model: "gpt-4o-2024-08-06"
    tools:
      - COMPOSIO.FIRECRAWL_SCRAPE
    params:
      prompt: |
         You should visit producthunt at https://www.producthunt.com/leaderboard/monthly/yyyy/mm
         Today is {{ get_current_date.current_date }}
         You should subsitute yyyy and mm by year and month you want to search.
         The search time window should be {{ get_timewindow.time_window }}.
         After that, you should extract raw content from the htmls associated,
         which will contain information about new product launches, their companies, number of upvotes, etc.
         Scroll the page until the end and wait a few miliseconds for it to launch before scraping.
    outputs:
      - product_hunt_scrape_results
```
The **tools** field can refer to any function defined in your functions file (.py).

**You can also use [Composio](https://composio.dev/) tools**, with the syntax "COMPOSIO.composio_tool_name". Check [Composio's documentation](https://app.composio.dev/sdk_guide) for a detailed view on all available tools.

**We also support [Langchain](https://langchain.com) tools**. You can use any tool from [langchain_community.tools](https://python.langchain.com/api_reference/community/tools.html) with the syntax "LANGCHAIN.langchain_tool_name".

If you want your output from OpenAI to follow a predetermined schema, **you can use the structured_output_schema field**, as in the node 'extract_info_from_search':

```
#product_hunt_analyzer.yaml

  - name: extract_info_from_search
    type: llm_service
    service: openai
    model: "gpt-4o-2024-08-06"
    structured_output_schema: StartupInformationList
    params:
      prompt: |
         You are given reports from a search to https://www.producthunt.com/leaderboard/monthly/, containing
         products featured there in the following time window: {{ get_timewindow.time_window }}. Here
         is the content of the search:
         {{ product_hunt_scrape.product_hunt_scrape_results }}.
         We want to extract accurate information about these new product launches.
         Structure the information there by the following dimensions:  product name, company name, company url, number of upvotes, business model
         brief description of it.
    outputs:
      - structured_search_info
```






The output will be an instance of the class **StartupInformationList**, which is defined in **structured_output_schema.py** as



```
# structured_output_schema.py

from pydantic import BaseModel, Field
from typing import List

class StartupInformation(BaseModel):
    product_name: str = Field(..., description="The name of the product")
    company_name: str = Field(..., description="The name of the company that offers the product. Could be equal to name of the product")
    url: str = Field(..., description="URL associated with the product.")
    number_upvotes: int = Field(..., description="Number of upvotes associated with the product")
    business_model: str = Field(..., description="A brief description about the business model of the product or company")
    brief_description: str = Field(..., description="A brief description about the product")

class StartupInformationList(BaseModel):
    information_list:List[StartupInformation]
```

The outputs of nodes with structured_output_schema are instances of the class defined in the schemas file (structured_output_schema.py in our case). To reference such an output in other nodes, it is useful to introduce an additional post-processing node that extracts the information we want from the class instance. That's the purpose of the postprocess_search_results node:

```
  - name: postprocess_search_results
    type: function_call
    function: postprocess_search_results_functions
    params:
      info: '{{ extract_info_from_search.structured_search_info }}'
    outputs:
      - postprocessed_search_results
```

which applies the function **postprocess_search_results_functions**, defined in **gensphere_functions.py**:

```
def postprocess_search_results_functions(info):
    result=info.model_dump().get('information_list')
    return {'postprocessed_search_results':result}
```







##### **3.3.3 yml_flow nodes**

These nodes represent entire yaml files. You can therefore nest workflows by referencing other yaml files here. We will see an example of a yaml file that contains yml_flow nodes below. For now, have a look at an example of a yml_flow node:



```
  - name: example_node_name
    type: yml_flow
    yml_file: path-to-yaml-file
    params:
      yml_flow_argument_example: 'xyz'
    outputs:
      - yml_flow_output_example
```

When referencing yml_flow nodes inside your yaml file, GenSphere will handle dependencies and **compose a combined yaml file** that is ready to run.
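
As an illustration of what composing might involve, the sketch below flattens a nested flow by prefixing each inner node name with the parent node name, which matches the `product_hunt_analyzer__get_current_date` style of names that appears in the execution log later in this tutorial. This is not GenSphere's implementation; `flatten_nested_nodes` and the plain-dict node representation are assumptions made for the example.

```python
import re

def flatten_nested_nodes(parent_name, inner_nodes):
    """Prefix inner node names and rewrite their {{ node.output }} references."""
    inner_names = {node["name"] for node in inner_nodes}
    reference = re.compile(r"\{\{\s*(\w+)\.")

    def rename(match):
        name = match.group(1)
        # Only rewrite references that point at nodes of the nested flow.
        if name in inner_names:
            return "{{ " + parent_name + "__" + name + "."
        return match.group(0)

    flattened = []
    for node in inner_nodes:
        # Copy the node so the original definition is left untouched.
        new_node = dict(node, name=f"{parent_name}__{node['name']}")
        if "prompt" in node:
            new_node["prompt"] = reference.sub(rename, node["prompt"])
        flattened.append(new_node)
    return flattened

inner = [
    {"name": "get_current_date"},
    {"name": "product_hunt_scrape",
     "prompt": "Today is {{ get_current_date.current_date }}"},
]
for node in flatten_nested_nodes("product_hunt_analyzer", inner):
    print(node)
```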



##### **3.3.4 Working with lists**

Often, the output of a node will be a Python list, and we will want to apply the next node to each individual element of that list.

You can accomplish this by **appending [i] to a node reference, as in {{ node_name.output_name[i] }}**. If a node contains such a reference (either in its 'params' field or in the 'prompt' field for llm_service nodes), **GenSphere will execute the node once for each element of the iterable and collect the outputs as a list**. For instance, let's examine the node 'find_extra_info':



```
  - name: find_extra_info
    type: llm_service
    service: openai
    model: "gpt-4o-2024-08-06"
    tools:
      - COMPOSIO.TAVILY_TAVILY_SEARCH
    params:
      prompt: |
         You should conduct a comprehensive search on the web about the following entry from producthunt.com:
         {{ postprocess_search_results.postprocessed_search_results[i] }}. You should look to find relevant news
         about the company, specially related to its revenue, valuation, traction, acquisition if applicable,
         number of users, etc.
    outputs:
      - startup_extra_info
```

Notice that the 'prompt' field references the output of the node 'postprocess_search_results' as:

```
{{ postprocess_search_results.postprocessed_search_results[i] }}
```

This means GenSphere will take each element of "postprocessed_search_results" (which is a list, as defined by the structured_output_schema applied in the node extract_info_from_search) and apply "find_extra_info" to each element of that list. **The outputs are then collected as a list.**

The end result is that we will do a different LLM API call for each entry that we found on product hunt separately. By doing so, we will get much better results than if we tried to find information about all entries at once.
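
The fan-out behaviour can be pictured with plain Python. The sketch below is only an analogy for what the `[i]` syntax does; `run_for_each` and `fake_llm_call` are hypothetical stand-ins, the `{{ item }}` placeholder is invented for the example, and no API call is made.

```python
def fake_llm_call(prompt):
    """Stand-in for an LLM API call; it just reports the prompt length."""
    return f"answer to a {len(prompt)}-character prompt"

def run_for_each(template, items, call=fake_llm_call):
    """Render the template once per item, call the node, and collect outputs."""
    outputs = []
    for item in items:
        prompt = template.replace("{{ item }}", str(item))
        outputs.append(call(prompt))
    return outputs

postprocessed_search_results = [
    {"product_name": "KYZON Space"},
    {"product_name": "Trag"},
]
startup_extra_info = run_for_each(
    "Search the web about: {{ item }}", postprocessed_search_results
)
print(len(startup_extra_info))  # one output per list element
```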


#### **3.4 Combine workflows to compose final yaml**

The workflow we have seen so far retrieves information from Product Hunt and performs extra web research on the companies listed there to find their revenue, number of users, etc. **Now, we will embed this into a larger workflow** that takes a startup idea as input, runs the Product Hunt search workflow, and creates a report covering potential competitors among recent Product Hunt launches, market trends, and so on.

Let's start with a new yaml file. It is available in the repo, inside the examples folder, at **https://raw.githubusercontent.com/octopus2023-inc/gensphere/refs/heads/main/examples/startup_idea_evaluator.yaml**



```
#startup_idea_evaluator.yaml

nodes:
  - name: read_idea
    type: function_call
    function: read_file_as_string
    params:
      file_path: "domains_to_search.txt"
    outputs:
      - domains
      
  - name: product_hunt_analyzer
    type: yml_flow
    yml_file: product_hunt_analyzer.yaml
    outputs:
      - postprocessed_search_results
      - startup_extra_info
      
  - name: generate_report
    type: llm_service
    service: openai
    model: "gpt-4o-2024-08-06"
    params:
      prompt: |
         You are a world class VC analyst. You are currently analyzing the following startup idea:
         {{ read_idea.domains }}
         Your task is to help analyze this idea in face of recent launches in product hunt.
         Some recents launches in producthunt.com are:
         {{ product_hunt_analyzer.postprocessed_search_results }}
         Besides that, some extra information about these companies is:
         {{ product_hunt_analyzer.startup_extra_info }}.
        
         Given that, you should create a detailed report containing the following:
         1. An overview of recent launches in producthunt.com. What are the main ideas being explored?
         2. A list of companies from producthunt launches that may become direct competitors to the startup idea.
         Explain your rational
         3. Create a list of the most promising startups from the producthunt launches, as defined by their
         valuation, revenue, traction or other relevant metrics.
         4. A table containing all information you found from producthunt launches.
         
         Answer in markdown format.
         
    outputs:
      - report
```



Notice that we have a **yml_flow** node being referenced here:

```
  - name: product_hunt_analyzer
    type: yml_flow
    yml_file: product_hunt_analyzer.yaml
    outputs:
      - postprocessed_search_results
      - startup_extra_info
```

In the yml_file field, we have product_hunt_analyzer.yaml, which is the path to the yaml file with the Product Hunt analysis we looked at before. **That means this node will trigger the execution of the entire workflow defined in product_hunt_analyzer.yaml.**

Now let's create a new yaml file, which we name "combined.yaml", with YamlCompose. GenSphere's YamlCompose class receives a yaml file as input, looks for yml_flow nodes in it, and resolves dependencies to create a final yaml file that is ready to run.

In [9]:
#copy the file from repo to local working directory in the notebook
!wget -O startup_idea_evaluator.yaml https://raw.githubusercontent.com/octopus2023-inc/gensphere/refs/heads/main/examples/startup_idea_evaluator.yaml





--2024-11-01 18:19:17--  https://raw.githubusercontent.com/octopus2023-inc/gensphere/refs/heads/main/examples/startup_idea_evaluator.yaml
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1780 (1.7K) [text/plain]
Saving to: ‘startup_idea_evaluator.yaml’


2024-11-01 18:19:17 (20.7 MB/s) - ‘startup_idea_evaluator.yaml’ saved [1780/1780]



In [11]:
composer=YamlCompose('startup_idea_evaluator.yaml',
                     'gensphere_functions.py',
                     'structured_output_schema.py')
combined_yaml_data=composer.compose(save_combined_yaml=True, output_file='combined.yaml')



INFO:composio:Logging is set to INFO, use `logging_level` argument or `COMPOSIO_LOGGING_LEVEL` change this


**Note:** When calling YamlCompose, you also need to pass the functions and schema files of the yaml file you want to parse. For simplicity, in our case all the functions and schemas we need are already defined in the files we pulled from the platform (gensphere_functions.py and structured_output_schema.py).
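
One reason the functions file has to be passed along is that every function_call node names a function that must exist there. The sketch below checks that with plain Python objects; it is an illustration under assumptions, not GenSphere's own consistency check, and `missing_functions` is a hypothetical helper.

```python
def missing_functions(nodes, available_functions):
    """List (node name, function name) pairs whose function is not defined."""
    missing = []
    for node in nodes:
        # Only function_call nodes point at a function in the functions file.
        if node.get("type") != "function_call":
            continue
        if node["function"] not in available_functions:
            missing.append((node["name"], node["function"]))
    return missing

nodes = [
    {"name": "read_idea", "type": "function_call", "function": "read_file_as_string"},
    {"name": "generate_report", "type": "llm_service"},
]
# Function names made up for this example; they do not describe the real file.
available = {"get_current_date_function", "get_timewindow_function"}
print(missing_functions(nodes, available))
```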

We can now visualize the combined yaml file, and check that the workflows have been correctly nested.

In [12]:
viz=Visualizer('combined.yaml',
               'gensphere_functions.py',
               'structured_output_schema.py',
               address='127.0.0.1', port=8050)
viz.start_visualization()



INFO:gensphere.graph_builder:Total elements generated: 16

Dash.run_server is deprecated and will be removed in Dash 3.0




### **4. Run your project**




Having defined the yaml, functions and schema files, we can now trigger execution using the **GenFlow class**. We simply pass the file paths to it and call the **".run()"** method.
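
GenFlow logs an execution order before running the nodes. One common way to derive such an order is a topological sort over the `{{ node.output }}` references; the sketch below shows that idea with the standard library's `graphlib` (Python 3.9+). Whether GenSphere computes the order this way is an assumption, and `execution_order` is a hypothetical helper.

```python
import re
from graphlib import TopologicalSorter

REFERENCE = re.compile(r"\{\{\s*(\w+)\.")

def execution_order(nodes):
    """Order node names so every node runs after the nodes it references."""
    graph = {}
    for name, text in nodes.items():
        # Every '{{ other_node.output }}' found in the text is a dependency.
        graph[name] = set(REFERENCE.findall(text))
    return list(TopologicalSorter(graph).static_order())

# Node name -> the text of its params/prompt (shortened for the example).
nodes = {
    "generate_report": "{{ read_idea.domains }} {{ find_extra_info.startup_extra_info }}",
    "find_extra_info": "{{ postprocess_search_results.postprocessed_search_results[i] }}",
    "postprocess_search_results": "",
    "read_idea": "",
}
order = execution_order(nodes)
print(order)
```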



The first node of combined.yaml, read_idea, expects a txt file saved locally as **"domains_to_search.txt"**. Let's create this file before executing the flow:

In [14]:
# create and save domains_to_search.txt

startup_idea="""
startup that creates interactive voice agents using generative AI with emphasis on applications like
language tutoring, entertainment or mental health. The business model would be B2C.
"""
with open("domains_to_search.txt", "w") as text_file:
    text_file.write(startup_idea)





In [15]:
logging.getLogger('composio').setLevel(logging.WARNING)
logging.getLogger('gensphere').setLevel(logging.DEBUG)
logging.getLogger('GenFlow').setLevel(logging.DEBUG)





In [16]:
flow=GenFlow('combined.yaml',
             'gensphere_functions.py',
             'structured_output_schema.py')
flow.parse_yaml()
flow.run()



DEBUG:gensphere.yaml_utils:Validating YAML file '/content/combined.yaml'
INFO:gensphere.genflow:yaml file /content/combined.yaml passed all consistency checks
INFO:gensphere.genflow:Execution order: ['read_idea', 'product_hunt_analyzer__get_current_date', 'product_hunt_analyzer__get_timewindow', 'product_hunt_analyzer__product_hunt_scrape', 'product_hunt_analyzer__extract_info_from_search', 'product_hunt_analyzer__postprocess_search_results', 'product_hunt_analyzer__find_extra_info', 'generate_report']
INFO:gensphere.genflow.read_idea:Executing node 'read_idea'
INFO:gensphere.genflow.product_hunt_analyzer__get_current_date:Executing node 'product_hunt_analyzer__get_current_date'
INFO:gensphere.genflow.product_hunt_analyzer__get_timewindow:Executing 

After execution is complete, you can access the results through the **.outputs** attribute of GenFlow, which returns a dict with every node name as a key and that node's outputs as the value.

In [17]:
final_node_output=flow.outputs.get("generate_report").get("report")

#visualize output
from IPython.display import display, Markdown
display(Markdown(final_node_output))





```markdown
# VC Analyst Report: Analysis of Startup Idea and Comparison with Product Hunt Launches

## 1. Overview of Recent Launches on Product Hunt

Recent launches on Product Hunt primarily showcase products aimed at enhancing productivity, communication, and leveraging AI for various applications. Here are the main ideas being explored:

- **AI-Powered Tools**: Many products leverage AI to improve efficiency and quality, such as Trag's AI-powered code review tool.
- **Productivity and Collaboration Platforms**: Several startups focus on productivity, such as KYZON Space for meetings and General Collaboration for unified communications.
- **Real-Time AI Interactions**: Video SDK 3.0 provides tools for creating real-time interactive video experiences.
- **Data Analytics and Insights**: buzzabout provides marketing insights based on large-scale data analysis.

The trend suggests a significant interest in using AI to drive efficiency and provide enhanced user experiences across various sectors.

## 2. Potential Direct Competitors

Below is a list of companies from the Product Hunt launches that could become direct competitors to the startup idea:

- **Video SDK 3.0**: Although primarily focused on video environments, its features for real-time, interactive AI characters could overlap with creating interactive voice agents in entertainment and tutoring applications. The emphasis on immersive experiences through real-time AI can be considered a competitive aspect.
  
**Rationale**: Video SDK 3.0’s expertise in real-time AI integration offers potential competition, especially if they expand offerings into interactive voice experiences.

## 3. Most Promising Startups

The most promising startups from the Product Hunt launches, based on available traction metrics and popularity, include:

- **KYZON Space**: Although detailed financial data is lacking, its significant upvotes on Product Hunt suggest strong user interest and potential traction.
- **Trag**: With high upvotes and robust functionality in AI-driven code review, this startup shows potential in the booming AI software tools sector.

**Note**: Due to the lack of specific revenue or valuation metrics provided, the assessment is based primarily on user interest as indicated by Product Hunt upvotes.

## 4. Table of Product Hunt Launches

Below is a summary table of key information concerning the recent Product Hunt launches:

| Product Name      | Company Name           | URL                                | Number of Upvotes | Business Model                                           | Brief Description                                                                                                                                              |
|-------------------|------------------------|------------------------------------|-------------------|----------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------|
| KYZON Space       | KYZON                  | [kyzonspace.com](https://www.kyzonspace.com) | 1633              | Subscription-based SaaS for teams                         | Provides tools for transforming ideas into actionable meeting outcomes.                                                                                       |
| Trag              | Trag                   | [trag.com](https://www.trag.com)   | 1614              | Freemium model with GitHub integration                    | AI-powered tool for code review enhancing code quality and efficiency.                                                                                       |
| Video SDK 3.0     | Video SDK              | [videosdk.com](https://www.videosdk.com) | 1608              | Pay-as-you-go for developers                              | Allows integration of real-time AI characters for dynamic and interactive video environments.                                                                 |
| buzzabout         | buzzabout              | [buzzabout.ai](https://www.buzzabout.ai) | 1319              | Tiered pricing for data analysis services                 | Provides insights from online discussions to empower strategic marketing decisions.                                                                           |
| Feta              | Feta                   | [fetastandups.com](https://www.fetastandups.com) | 1195              | Subscription plans for team collaboration                 | Enhances stand-up meetings and productivity through advanced communication tools.                                                                             |
| General Collaboration | General Collaboration | [generalcollaboration.com](https://www.generalcollaboration.com) | 1163              | Basic and premium collaboration plans                     | Centralizes work discussions to enhance productivity and coordination in remote work setups.                                                                  |
| HeyForm 3.0       | HeyForm                | [heyform.com](https://www.heyform.com) | 1125              | Open-source with premium features                         | Open-source form builder tailored for small businesses.                                                                                                        |

*Note: The analysis is constrained by the lack of detailed financial metrics and relies heavily on the popularity and user engagement metrics provided by Product Hunt upvotes.*
```
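
The chained `.get()` lookup used above to read the report raises an unhelpful `AttributeError` if the node name is misspelled, because the first `.get()` returns `None`. A small helper can make that case explicit. The sample dict below is made up to mirror the shape of `.outputs` described earlier, and `get_output` is a hypothetical helper, not a GenSphere function.

```python
def get_output(outputs, node_name, output_name):
    """Fetch one output of one node, with a clear error if either is missing."""
    if node_name not in outputs:
        raise KeyError(f"node '{node_name}' not found; available: {sorted(outputs)}")
    node_outputs = outputs[node_name]
    if output_name not in node_outputs:
        raise KeyError(
            f"output '{output_name}' not found in node '{node_name}'; "
            f"available: {sorted(node_outputs)}"
        )
    return node_outputs[output_name]

# Made-up data with the same shape as the dict described in this tutorial.
sample_outputs = {
    "read_idea": {"domains": "voice agents startup"},
    "generate_report": {"report": "# VC Analyst Report"},
}
print(get_output(sample_outputs, "generate_report", "report"))
```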


### **5. Push to the platform**

After you have finished your project, you can push your yaml, functions and schema files to the platform. This generates a push_id that you or anyone else can use to pull your project locally. To do that, we create a Hub with the paths to the files and call hub.push(). You can also add a brief description with "push_name".

In [20]:
hub=Hub(yaml_file='combined.yaml',
        functions_file='gensphere_functions.py',
        schema_file='structured_output_schema.py')
result=hub.push(push_name='workflow to analyze startup idea based on recent producthunt launches.')



DEBUG:gensphere.yaml_utils:Validating YAML file '/content/combined.yaml'


In [28]:
print(f"push id is {result.get('push_id')}")
print(f"uploaded files are {result.get('uploaded_files')}")

push id is dc6ba3ae-8221-4264-94e9-bb805d9a1365
uploaded files are ['yaml_file.yaml', 'gensphere_functions.py', 'structured_output_schema.py']






### **6. Check project popularity**

You can check how many times your project was pulled from the platform by using the 'count_pulls' method, and passing your push_id.

In [29]:
# Get the total number of pulls for the push_id
total_pulls = hub.count_pulls(push_id='de8afbeb-06cb-4f8f-8ead-64d9e6ef5326')
print(f"Total pulls for push_id: {total_pulls}")





Total pulls for push_id: 5
