I encountered an issue with TaskWeaver's code interpreter when analyzing a CSV file containing customer complaints. After uploading the CSV and requesting the top 5 themes of the complaints, the interpreter generated Python code utilizing Latent Dirichlet Allocation (LDA) for topic modeling. However, the resulting themes were merely lists of words without meaningful context, which doesn't meet the expectation of coherent thematic summaries.
Steps to Reproduce:
Upload a CSV file with a column containing textual data of customer complaints.
Ask TaskWeaver: "Provide the top 5 themes of the complaints."
Expected Behavior: The Large Language Model (LLM) should interpret the complaints and provide coherent thematic summaries, such as:
Billing Issues
Service Delays
Product Quality Concerns
Customer Support Complaints
Account Access Problems
Actual Behavior: The code interpreter generated the following Python code:
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
# Vectorize the complaint texts
vectorizer = CountVectorizer(stop_words='english')
X = vectorizer.fit_transform(df['COMPLAINT_SUM_TXT'])
# Apply LDA for topic modeling
lda = LatentDirichletAllocation(n_components=5, random_state=0)
lda.fit(X)
# Get the top words for each topic
def get_top_words(model, feature_names, n_top_words):
    top_words = []
    for topic_idx, topic in enumerate(model.components_):
        top_words.append([feature_names[i] for i in topic.argsort()[:-n_top_words - 1:-1]])
    return top_words
n_top_words = 5
feature_names = vectorizer.get_feature_names_out()
top_themes = get_top_words(lda, feature_names, n_top_words)
top_themes
These outputs are lists of words without clear thematic context, making it difficult to derive actionable insights.
Is there a way to make TaskWeaver use the LLM, or invoke an agent/plugin, to post-process the output of the generated code and produce the expected themes? Any suggestions or alternative approaches are welcome. Thank you!
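One way to post-process the code output, as asked above, is to hand the LDA word lists back to the LLM together with a few example complaints per topic and ask it to name each topic. The following is only a minimal sketch of the prompt-building step: `build_theme_prompt` and its arguments are hypothetical names, the demo data is made up, and the actual LLM call is left out.

```python
from typing import Sequence


def build_theme_prompt(
    topic_words: Sequence[Sequence[str]],
    sample_complaints: Sequence[Sequence[str]],
) -> str:
    """Compose one prompt asking an LLM to name each LDA topic.

    topic_words[i] holds the top words of topic i (e.g. `top_themes` above);
    sample_complaints[i] holds a few complaints assigned to topic i.
    """
    if len(topic_words) != len(sample_complaints):
        raise ValueError("need one list of sample complaints per topic")

    lines = [
        "Each topic below comes from LDA over customer complaints.",
        "Give every topic a short, human-readable theme name.",
        "",
    ]
    for idx, (words, samples) in enumerate(zip(topic_words, sample_complaints), start=1):
        lines.append(f"Topic {idx} keywords: {', '.join(words)}")
        for text in samples:
            lines.append(f"  - example: {text}")
    return "\n".join(lines)


# Made-up data, for illustration only.
demo_prompt = build_theme_prompt(
    [["bill", "charge", "refund"], ["login", "password", "locked"]],
    [["I was charged twice this month."], ["Cannot reset my password."]],
)
print(demo_prompt)
```

Including example complaints next to the keywords gives the model the context that the bare word lists lack.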
I do not have data to reproduce this, but I can see a few different approaches:
You can build a plugin that takes the CSV file as input, so that you control the logic inside the plugin. This is more suitable if you consider this flow a common one and need to analyze many similar files. Especially if you have a pre-defined list of themes, this is the best way to incorporate it into the flow.
You can break the flow into two steps, e.g., first load the file and display its content, and then ask the agent to do the second step. This works if the file is small enough for the LLM to handle entirely in one prompt. It is also better suited to casual, one-off analysis.
You can add an example to the framework that demonstrates the flow you want the agent to follow. In effort and reusability, this sits between the two options above.
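The core logic of the first option (a plugin with a pre-defined theme list) could look roughly like the sketch below. It is not TaskWeaver's plugin API: the TaskWeaver plugin wrapper is omitted, `top_themes_with_llm` and `ask_llm` are hypothetical names, the theme list is copied from the examples in the issue, and `fake_llm` is a stand-in so the sketch runs without a model.

```python
from collections import Counter
from typing import Callable, Iterable, List, Sequence, Tuple

# Theme list taken from the examples in the issue; replace with your own.
THEMES = (
    "Billing Issues",
    "Service Delays",
    "Product Quality Concerns",
    "Customer Support Complaints",
    "Account Access Problems",
)


def top_themes_with_llm(
    complaints: Iterable[str],
    ask_llm: Callable[[str], str],
    themes: Sequence[str] = THEMES,
    k: int = 5,
) -> List[Tuple[str, int]]:
    """Label each complaint with one pre-defined theme and count the labels."""
    counts: Counter = Counter()
    for text in complaints:
        prompt = (
            "Classify the complaint into exactly one of these themes: "
            + "; ".join(themes)
            + f"\nComplaint: {text}\nAnswer with the theme name only."
        )
        answer = ask_llm(prompt).strip()
        # Anything outside the known list is bucketed as "Other".
        counts[answer if answer in themes else "Other"] += 1
    return counts.most_common(k)


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    complaint = prompt.split("Complaint: ", 1)[1].lower()
    return "Billing Issues" if "charge" in complaint else "Service Delays"


result = top_themes_with_llm(
    ["I was charged twice.", "Technician arrived late.", "Wrong charge on invoice."],
    fake_llm,
)
print(result)
```

Passing the model call in as a function keeps the counting logic testable and leaves the choice of LLM client to whoever wires this into a plugin. One call per complaint is simple but slow for large files; batching several complaints per prompt would be a natural refinement.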
@liqul Thank you for your response. The themes question is just one example; for any analytics question, TaskWeaver tries to solve it with code generated by the code interpreter.
When I upload the same CSV in Copilot and ask the same questions, it responds as expected. Is there a way to bring such functionality into Taskweaver?