Add AI research agent using smolagents #120
Conversation
Walkthrough

A new research assistant application is introduced using the Smol Agents framework. The update adds a Streamlit-based interface, agent logic for enhanced web search and webpage retrieval, supporting utilities, and documentation. Dependency management and environment variable configuration are provided via requirements and example environment files.
Sequence Diagram(s)

sequenceDiagram
participant User
participant Streamlit UI
participant ManagerAgent
participant ToolCallingAgent
participant EnhancedSearchTool
participant VisitWebpageTool
User->>Streamlit UI: Enter search query and settings
Streamlit UI->>ManagerAgent: Run manager agent with query and parameters
ManagerAgent->>ToolCallingAgent: Delegate search task
ToolCallingAgent->>EnhancedSearchTool: Perform enhanced search
EnhancedSearchTool->>ToolCallingAgent: Return search results
ToolCallingAgent->>VisitWebpageTool: Fetch webpage content (as needed)
VisitWebpageTool->>ToolCallingAgent: Return webpage content
ToolCallingAgent->>ManagerAgent: Return aggregated results
ManagerAgent->>Streamlit UI: Provide formatted response
Streamlit UI->>User: Display results, sources, analysis, and logs
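The delegation flow in the diagram above can be sketched framework-free. Every name below (ManagerAgent, ToolCallingAgent, the two tool functions) is an illustrative stand-in for the roles in the diagram, not the actual smolagents API:

```python
# Framework-free sketch of the delegation flow: the manager receives the query,
# the worker calls the search tool, then the webpage tool, and results bubble up.
# All class and function names are illustrative placeholders.

def enhanced_search(query: str) -> list:
    # Stand-in for EnhancedSearchTool; the real tool queries DuckDuckGo.
    return [{"title": f"Result for {query}", "url": "https://example.com"}]

def visit_webpage(url: str) -> str:
    # Stand-in for VisitWebpageTool; the real tool fetches and markdownifies.
    return f"# Content of {url}"

class ToolCallingAgent:
    """Delegates to concrete tools and aggregates their output."""
    def run(self, query: str) -> dict:
        results = enhanced_search(query)
        for r in results:
            r["content"] = visit_webpage(r["url"])
        return {"Results": results}

class ManagerAgent:
    """Receives the UI query and delegates the search task to a worker agent."""
    def __init__(self, worker: ToolCallingAgent):
        self.worker = worker
    def run(self, query: str) -> dict:
        aggregated = self.worker.run(query)
        aggregated["summary"] = f"{len(aggregated['Results'])} source(s) for: {query}"
        return aggregated

response = ManagerAgent(ToolCallingAgent()).run("agentic AI")
```

The Streamlit UI would then render `response` across its Results/Sources tabs.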
Actionable comments posted: 4
🧹 Nitpick comments (5)
researchagent-smolagents/README.md (1)
1-12: Polish README wording and lint issues for professionalism & discoverability.
- Typo: “AI Reseach Agent” ⇒ “AI Research Agent”.
- Style: “going to be” → “will be”.
- Provide a descriptive link title instead of a bare URL (MD034).
- Capitalise “Hugging Face”.
-# AI Reseach Agent
-A simple python application to demonstrate how agentic era is going to be. Used Smol Agents framework by hugging face.
+# AI Research Agent
+A simple Python application demonstrating how the agentic era will be.
+Built with the Smol Agents framework by **Hugging Face**.
@@
-https://github.com/user-attachments/assets/1f876f9c-3dff-4548-b14c-bb6439eede30
+[Watch the demo video](https://github.com/user-attachments/assets/1f876f9c-3dff-4548-b14c-bb6439eede30)

🧰 Tools
🪛 LanguageTool
[style] ~2-~2: Use ‘will’ instead of ‘going to’ if the following action is certain.
Context: ...lication to demonstrate how agentic era is going to be. Used Smol Agents framework by huggi...
(GOING_TO_WILL)
🪛 markdownlint-cli2 (0.17.2)
6-6: Bare URL used (MD034, no-bare-urls)
researchagent-smolagents/agents.py (4)

10-20: Remove unused imports to keep the module clean.

`timedelta` and `DuckDuckGoSearchTool` aren't referenced after import.

-from datetime import datetime, timedelta
+from datetime import datetime
@@
-    DuckDuckGoSearchTool,

🧰 Tools
🪛 Ruff (0.8.2)
10-10: `datetime.timedelta` imported but unused. Remove unused import: `datetime.timedelta` (F401)

18-18: `smolagents.DuckDuckGoSearchTool` imported but unused. Remove unused import: `smolagents.DuckDuckGoSearchTool` (F401)
210-223: Log buffer may grow unbounded – reset or cap size to prevent memory bloat.

`st.session_state.log_container` accumulates every search's logs for the lifetime of the session. For long sessions this can degrade performance. Minimal fix: keep only the last n entries.

-        st.session_state.log_container.extend(captured_logs)
+        st.session_state.log_container.extend(captured_logs)
+        # Cap to last 1 000 lines
+        max_logs = 1000
+        if len(st.session_state.log_container) > max_logs:
+            st.session_state.log_container = st.session_state.log_container[-max_logs:]
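An alternative to slicing, assuming the log container may be any mutable sequence rather than a plain list, is a bounded deque that discards old entries automatically:

```python
from collections import deque

# Hypothetical stand-in for st.session_state.log_container: a fixed-size
# buffer that silently drops the oldest lines once max_logs is reached.
max_logs = 1000
log_container = deque(maxlen=max_logs)

# Simulate a long session producing more logs than the cap.
captured_logs = [f"log line {i}" for i in range(1500)]
log_container.extend(captured_logs)
```

The deque never exceeds `max_logs` entries, so no explicit trimming step is needed after each search.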
394-396: Combine nested `with` statements for cleaner, more readable code.

-        with st.expander(f"Source {idx}: {source}"):
-            # Add a loading spinner while fetching content
-            with st.spinner(f"Loading content from source {idx}..."):
+        with st.expander(f"Source {idx}: {source}"), \
+             st.spinner(f"Loading content from source {idx}..."):

This follows the Ruff SIM117 recommendation and reduces indentation.
🧰 Tools
🪛 Ruff (0.8.2)
394-396: Use a single `with` statement with multiple contexts instead of nested `with` statements (SIM117)
267-270: Cache expiry missing – external pages may change or vanish.

`@st.cache_data` without a `ttl` keeps content forever and can serve stale pages. Add a reasonable TTL:

-@st.cache_data
+@st.cache_data(ttl=60 * 60)  # 1 hour
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
researchagent-smolagents/.env (1 hunks)
researchagent-smolagents/README.md (1 hunks)
researchagent-smolagents/agents.py (1 hunks)
researchagent-smolagents/requirements.txt (1 hunks)
🧰 Additional context used
🪛 LanguageTool
researchagent-smolagents/README.md
[style] ~2-~2: Use ‘will’ instead of ‘going to’ if the following action is certain.
Context: ...lication to demonstrate how agentic era is going to be. Used Smol Agents framework by huggi...
(GOING_TO_WILL)
🪛 markdownlint-cli2 (0.17.2)
researchagent-smolagents/README.md
6-6: Bare URL used (MD034, no-bare-urls)
🪛 Ruff (0.8.2)
researchagent-smolagents/agents.py
10-10: `datetime.timedelta` imported but unused. Remove unused import: `datetime.timedelta` (F401)

18-18: `smolagents.DuckDuckGoSearchTool` imported but unused. Remove unused import: `smolagents.DuckDuckGoSearchTool` (F401)
89-89: Local variable `response` is assigned to but never used. Remove assignment to unused variable `response` (F841)
394-396: Use a single `with` statement with multiple contexts instead of nested `with` statements (SIM117)
researchagent-smolagents/.env
Outdated
#add you hugging face token
HF_TOKEN="hf_..."
🛠️ Refactor suggestion
Avoid committing real or placeholder secrets – ship an example file instead.
Including `.env` in the repo risks leaking real credentials and trains newcomers to commit secrets.
• Rename this file to `.env.example` and replace the value with a clear placeholder (`HF_TOKEN="<YOUR_TOKEN_HERE>"`).
• Add `.env` to `.gitignore` so personal tokens stay local.
-#add you hugging face token
-HF_TOKEN="hf_..."
+# Example – copy to `.env`, fill with your token and keep that file un-tracked.
+HF_TOKEN="<YOUR_TOKEN_HERE>"
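On the consuming side, the app can fail fast when the token is missing or still the placeholder. This is a sketch: `get_hf_token` and its error message are illustrative, and in the real app python-dotenv's `load_dotenv()` would populate `os.environ` first.

```python
import os

# In the app, python-dotenv's load_dotenv() would fill os.environ from .env;
# here we only show the fail-fast check, which is plain stdlib.

def get_hf_token() -> str:
    """Return HF_TOKEN, rejecting both a missing value and the raw placeholder."""
    token = os.environ.get("HF_TOKEN", "")
    if not token or token == "<YOUR_TOKEN_HERE>":
        raise RuntimeError(
            "HF_TOKEN is not set - copy .env.example to .env and fill it in."
        )
    return token
```

Calling this once at startup gives newcomers a clear error instead of an opaque Hugging Face authentication failure later.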
streamlit
requests
markdownify
huggingface_hub
smolagents
python-dotenv
💡 Verification agent
🧩 Analysis chain
Missing runtime dependencies & pinning – app will crash in prod.
`agents.py` directly imports `pandas` and `bs4` (`BeautifulSoup`) but they are not listed here. The app will raise `ModuleNotFoundError` in a clean environment.
streamlit
requests
markdownify
huggingface_hub
smolagents
python-dotenv
+pandas
+beautifulsoup4
Consider pinning versions (e.g., streamlit>=1.34,<2
) for deterministic builds.
#!/bin/bash
# Search for pandas imports across the codebase
rg -n "import pandas"
rg -n "from pandas"
# Search for bs4 or BeautifulSoup usage
rg -n "import bs4"
rg -n "from bs4"
rg -n "BeautifulSoup"
🏁 Script executed:
#!/bin/bash
# Locate agents.py in the repository
file=$(fd -I agents.py)
echo "agents.py found at: $file"
# Search for pandas imports/usage
rg -n "import pandas" "$file"
rg -n "from pandas" "$file"
# Search for bs4/BeautifulSoup imports/usage
rg -n "import bs4" "$file"
rg -n "from bs4" "$file"
rg -n "BeautifulSoup" "$file"
Length of output: 1577
#!/bin/bash
# Search for pandas and BeautifulSoup usage in all Python files
rg -n -e "import pandas" -e "from pandas" -e "import bs4" -e "from bs4" -e "BeautifulSoup" --glob '*.py'
🏁 Script executed:
#!/bin/bash
# Verify pandas & BeautifulSoup imports in the specific agents.py file
file="researchagent-smolagents/agents.py"
if [ -f "$file" ]; then
echo "Scanning $file for dependencies…"
rg -n "import pandas" "$file"
rg -n "from pandas" "$file"
rg -n "import bs4" "$file"
rg -n "from bs4" "$file"
rg -n "BeautifulSoup" "$file"
else
echo "File not found: $file"
fi
Length of output: 693
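The grep-based dependency check the scripts above perform can be approximated in pure Python with the stdlib `ast` module. The `sample` source and `declared` set below are toy inputs, not the real agents.py or requirements.txt:

```python
import ast

def top_level_imports(source: str) -> set:
    """Return the top-level package names imported by a Python source string."""
    packages = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            packages.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            packages.add(node.module.split(".")[0])
    return packages

# Toy stand-ins for agents.py and requirements.txt.
sample = "import pandas as pd\nfrom bs4 import BeautifulSoup\nimport requests\n"
declared = {"streamlit", "requests", "markdownify",
            "huggingface_hub", "smolagents", "dotenv"}

missing = top_level_imports(sample) - declared
```

Comparing the two sets surfaces undeclared runtime dependencies before they fail in production.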
Add missing dependencies and pin versions in requirements.txt
The file `researchagent-smolagents/agents.py` imports runtime libraries that aren't listed in `requirements.txt`, which will cause `ModuleNotFoundError` in a clean environment:
• researchagent-smolagents/agents.py:3 – `import pandas as pd`
• researchagent-smolagents/agents.py:110–111 – `from bs4 import BeautifulSoup` / `soup = BeautifulSoup(…)`
Please update `researchagent-smolagents/requirements.txt`:
streamlit
requests
markdownify
huggingface_hub
smolagents
python-dotenv
+pandas
+beautifulsoup4
For deterministic builds, consider pinning versions. For example:
streamlit>=1.34,<2.0
pandas>=2.0,<3.0
beautifulsoup4>=4.12,<5.0
researchagent-smolagents/agents.py
Outdated
    # First request to get token
    response = requests.get('https://html.duckduckgo.com/html/', headers=headers)

    # Extract search parameters
    search_params = {
🛠️ Refactor suggestion
Unused HTTP request wastes latency and bandwidth.
`response = requests.get('https://html.duckduckgo.com/html/', …)` is assigned but never used. Delete the call or utilise the headers/cookies it should return.

-    # First request to get token
-    response = requests.get('https://html.duckduckgo.com/html/', headers=headers)
+    # NOTE: Removed pre-flight request – not required for DuckDuckGo HTML endpoint.
🧰 Tools
🪛 Ruff (0.8.2)
89-89: Local variable `response` is assigned to but never used. Remove assignment to unused variable `response` (F841)
researchagent-smolagents/agents.py
Outdated
def visit_webpage(url: str) -> str:
    """Visits a webpage and converts its content to markdown format.

    Args:
        url: The complete URL of the webpage to visit (e.g., 'https://example.com').
            Must be a valid HTTP or HTTPS URL.

    Returns:
        str: The webpage content converted to Markdown format with the reference webpages links.
            Returns an error message if the request fails.
    """
    try:
        response = requests.get(url)
        response.raise_for_status()
        markdown_content = markdownify(response.text).strip()
        markdown_content = re.sub(r"\n{3,}", "\n\n", markdown_content)
        return markdown_content
    except RequestException as e:
        return f"Error fetching the webpage: {str(e)}"
    except Exception as e:
        return f"An unexpected error occurred: {str(e)}"
Add timeout & headers when fetching external pages to avoid hangs and blocks.
`requests.get(url)` without a timeout can freeze Streamlit workers; some sites also block default user-agents.

-        response = requests.get(url)
+        response = requests.get(
+            url,
+            timeout=15,
+            headers={"User-Agent": "Mozilla/5.0 (WebResearchBot/1.0)"}
+        )
Also consider limiting download size (e.g., via `stream=True` and `iter_content`) to mitigate huge pages.
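The suggested size cap can be kept network-free and testable by isolating the chunk-reading loop. `stream=True` and `iter_content` are real requests features, while `read_limited` and `MAX_BYTES` are illustrative names:

```python
MAX_BYTES = 2_000_000  # illustrative cap (~2 MB)

def read_limited(chunks, max_bytes: int = MAX_BYTES) -> bytes:
    """Accumulate byte chunks (e.g. from response.iter_content) up to max_bytes."""
    buf = bytearray()
    for chunk in chunks:
        buf.extend(chunk)
        if len(buf) >= max_bytes:
            break  # stop pulling chunks once the cap is reached
    return bytes(buf[:max_bytes])

# Inside visit_webpage this would be wired up roughly as:
#   response = requests.get(url, timeout=15, stream=True,
#                           headers={"User-Agent": "Mozilla/5.0 (WebResearchBot/1.0)"})
#   body = read_limited(response.iter_content(chunk_size=8192))
```

Because the helper only consumes an iterable of chunks, it can be unit-tested without any HTTP traffic.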
## Demo Video

https://github.com/user-attachments/assets/1f876f9c-3dff-4548-b14c-bb6439eede30
This link results in Error 404 (Page not found) for me
PR Review: Add AI research agent using smolagents

Thank you for this contribution! I've reviewed the PR which adds a new AI research agent using the smolagents framework. Here's my feedback:

Strengths

Suggested Improvements

1. Security & Best Practices

2. Dependencies

3. Documentation

4. Performance Considerations

Minor Issues

Overall, this is a valuable addition to the repository. With the suggested improvements, it will be even more robust and user-friendly. Great work!
Actionable comments posted: 3
🔭 Outside diff range comments (1)
researchagent-smolagents/agents.py (1)
258-269: ⚠️ Potential issue – Incomplete implementation of search functionality

The code ends abruptly with an incomplete block after setting up the tabs. The search functionality appears to be incomplete. Complete the search functionality by adding the following code:

 result_tab, sources_tab, analysis_tab, logs_tab = st.tabs(
     ["📝 Results", "🔗 Sources", "📊 Analysis", "📋 Logs"]
 )
+
+# Run search with the manager agent
+result = manager_agent.run(
+    f"""
+    Research query: {query}
+    Time period: {time_period}
+    Search depth: {search_depth}
+    Max results: {max_results}
+
+    Please provide comprehensive research on this topic, including:
+    1. Key facts and information
+    2. Different perspectives or viewpoints
+    3. Recent developments or updates
+    4. A summary of your findings
+    """
+)
+
+# Capture logs from output
+for timestamp, log in zip(output.timestamps, output.logs):
+    st.session_state.log_container.append((timestamp, log))
+
+# Display results in tabs
+with result_tab:
+    st.markdown(format_agent_response(result))
+
+with sources_tab:
+    if isinstance(result, dict) and 'Results' in result:
+        for idx, source in enumerate(result['Results'], 1):
+            st.markdown(f"### Source {idx}: {source.get('title', 'No Title')}")
+            st.markdown(f"**URL:** {source.get('url', 'No URL')}")
+            st.markdown(f"**Recency Score:** {source.get('recency_score', 0.0):.2f}")
+            with st.expander("View Source Content", expanded=False):
+                st.markdown(fetch_webpage_content(source['url'])[:5000] + "...")
+
+with analysis_tab:
+    if isinstance(result, dict) and 'thoughts' in result:
+        st.markdown(result['thoughts'])
+    else:
+        st.info("No detailed analysis available.")
+
+with logs_tab:
+    for timestamp, log in st.session_state.log_container:
+        st.text(f"[{timestamp}] {log}")
+
+except Exception as e:
+    st.error(f"An error occurred during research: {str(e)}")

🧰 Tools
🪛 Ruff (0.11.9)
269-269: SyntaxError: Expected `except` or `finally` after `try` block
♻️ Duplicate comments (2)
researchagent-smolagents/agents.py (2)
68-68: Remove unused HTTP request

This request is assigned but never used. It's simply wasting bandwidth and adding latency.

-    response = requests.get('https://html.duckduckgo.com/html/', headers=headers)

128-138: Add timeout & headers when fetching external pages to avoid hangs and blocks

`requests.get(url)` without a timeout can freeze Streamlit workers; some sites also block default user-agents.

-        response = requests.get(url)
+        response = requests.get(
+            url,
+            timeout=15,
+            headers={"User-Agent": "Mozilla/5.0 (WebResearchBot/1.0)"}
+        )

Also consider limiting download size (e.g., via `stream=True` and `iter_content`) to mitigate huge pages.
🧹 Nitpick comments (2)
researchagent-smolagents/agents.py (1)

211-214: Webpage content fetching doesn't match cache key pattern

The `fetch_webpage_content` function is decorated with `@st.cache_data`, but it doesn't properly cache results based on changes in input parameters. The function will be recalled unnecessarily when the same URL is visited multiple times with different query parameters.

 @st.cache_data
-def fetch_webpage_content(url):
+def fetch_webpage_content(url: str) -> str:
+    """Fetch and convert webpage content to markdown with caching"""
     return visit_webpage(url)

researchagent-smolagents/README.md (1)
6-6: Use proper Markdown link syntax instead of bare URL

The GitHub asset link should use proper Markdown link syntax for better readability and compatibility.

-https://github.com/user-attachments/assets/1f876f9c-3dff-4548-b14c-bb6439eede30
+[View Demo](https://github.com/user-attachments/assets/1f876f9c-3dff-4548-b14c-bb6439eede30)

Also note that this link results in a 404 error as mentioned in a previous review. Please verify the correct URL.
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
6-6: Bare URL used (MD034, no-bare-urls)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting
📒 Files selected for processing (3)
researchagent-smolagents/.env.example (1 hunks)
researchagent-smolagents/README.md (1 hunks)
researchagent-smolagents/agents.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- researchagent-smolagents/.env.example
🧰 Additional context used
🪛 LanguageTool
researchagent-smolagents/README.md
[uncategorized] ~2-~2: You might be missing the article “the” here.
Context: ...e python application to demonstrate how agentic era is going to be. Used Smol Agents fr...
(AI_EN_LECTOR_MISSING_DETERMINER_THE)
[style] ~2-~2: Use ‘will’ instead of ‘going to’ if the following action is certain.
Context: ...lication to demonstrate how agentic era is going to be. Used Smol Agents framework by huggi...
(GOING_TO_WILL)
🪛 markdownlint-cli2 (0.17.2)
researchagent-smolagents/README.md
6-6: Bare URL used (MD034, no-bare-urls)
    return {
        'Results': results[:10],
        'meta': {
🛠️ Refactor suggestion
Respect user's max_results setting
The hard-coded limit of 10 results doesn't respect the user's max_results setting from the sidebar.
- 'Results': results[:10],
+ 'Results': results[:max_results],
However, this would require passing `max_results` as a parameter to the `enhanced_search` function. Consider refactoring to make this function respect user preferences.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In researchagent-smolagents/agents.py around lines 106 to 108, the function
currently limits results to 10 regardless of the user's max_results setting.
Modify the function signature to accept a max_results parameter and replace the
hard-coded 10 with this parameter to respect user preferences. Ensure that when
calling enhanced_search, the max_results value from the user settings is passed
accordingly.
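A sketch of threading `max_results` through, with the search body stubbed out since the real `enhanced_search` scrapes DuckDuckGo (the stub's result list is illustrative):

```python
def enhanced_search(query: str, max_results: int = 10) -> dict:
    # Stubbed result list standing in for the real DuckDuckGo scrape.
    results = [{"title": f"{query} #{i}", "url": f"https://example.com/{i}"}
               for i in range(25)]
    return {
        "Results": results[:max_results],  # respect the sidebar setting
        "meta": {
            "total_found": len(results),
            "returned": min(max_results, len(results)),
        },
    }

capped = enhanced_search("agentic AI", max_results=5)
```

The caller would pass the sidebar value, e.g. `enhanced_search(query, max_results=max_results)`, instead of relying on the hard-coded 10.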
    search_params = {
        'q': time_keywords,
        's': '0',
        'dc': '20',
        'v': 'l',
        'o': 'json',
        'api': '/d.js',
    }
🛠️ Refactor suggestion
Time period parameter is accepted but not used
The function accepts a time_period
parameter but doesn't use it in the search parameters. Consider adding it to the search parameters to respect user's time period preference.
search_params = {
'q': time_keywords,
's': '0',
'dc': '20',
+ 'df': time_period,
'v': 'l',
'o': 'json',
'api': '/d.js',
}
🤖 Prompt for AI Agents
In researchagent-smolagents/agents.py around lines 70 to 77, the time_period
parameter is accepted by the function but not included in the search_params
dictionary. To fix this, add the time_period value to the search_params with the
appropriate key expected by the API, ensuring the search respects the user's
time period preference.
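One possible wiring of the sidebar choice into the `df` parameter. The UI label strings below are assumptions about the app's sidebar options, and 'd'/'w'/'m'/'y' are DuckDuckGo's conventional date-filter codes:

```python
# Hypothetical UI labels -> DuckDuckGo 'df' date-filter values
# ('d'ay, 'w'eek, 'm'onth, 'y'ear). Unknown labels apply no time filter.
TIME_PERIOD_TO_DF = {
    "Past day": "d",
    "Past week": "w",
    "Past month": "m",
    "Past year": "y",
}

def build_search_params(query: str, time_period: str) -> dict:
    params = {"q": query, "s": "0", "dc": "20", "v": "l", "o": "json", "api": "/d.js"}
    df = TIME_PERIOD_TO_DF.get(time_period)
    if df:
        params["df"] = df  # only add the filter when the label is recognised
    return params

params = build_search_params("agentic AI", "Past week")
```

Falling back to no filter for unrecognised labels keeps the search working even if the sidebar options change.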
# AI Research Agent
A simple python application to demonstrate how agentic era is going to be. Used Smol Agents framework by hugging face.

## Demo Video

https://github.com/user-attachments/assets/1f876f9c-3dff-4548-b14c-bb6439eede30

## Task Performed
- AI agent to grasp information on the topic given.
- AI agent to scrape the information and collect the necessary information about the topic.
- AI agent to summarize the whole topic.
🛠️ Refactor suggestion
Enhance README with setup instructions and usage examples
The README lacks critical information about installation, setup, and usage. Consider adding:
- Installation instructions (pip install requirements)
- Setup instructions (.env file configuration)
- Usage examples (how to run the Streamlit app)
- Dependencies and requirements
- Fix grammar in the introduction
Here's a suggested enhancement:
# AI Research Agent
-A simple python application to demonstrate how agentic era is going to be. Used Smol Agents framework by hugging face.
+A simple Python application demonstrating the agentic era using the Smol Agents framework by Hugging Face.
## Demo Video
-https://github.com/user-attachments/assets/1f876f9c-3dff-4548-b14c-bb6439eede30
+[View Demo](https://github.com/user-attachments/assets/1f876f9c-3dff-4548-b14c-bb6439eede30)
-## Task Performed
+## Tasks Performed
- AI agent to grasp information on the topic given.
- AI agent to scrape the information and collect the necessary information about the topic.
- AI agent to summarize the whole topic.
+
+## Installation
+
+```bash
+pip install -r requirements.txt
+```
+
+## Setup
+
+1. Create a `.env` file in the project root directory
+2. Add your Hugging Face API token to the `.env` file:
+ ```
+ HF_TOKEN=your_huggingface_token_here
+ ```
+
+## Usage
+
+Run the Streamlit app:
+
+```bash
+streamlit run agents.py
+```
+
+Then open your browser to http://localhost:8501
🧰 Tools
🪛 LanguageTool
[uncategorized] ~2-~2: You might be missing the article “the” here.
Context: ...e python application to demonstrate how agentic era is going to be. Used Smol Agents fr...
(AI_EN_LECTOR_MISSING_DETERMINER_THE)
[style] ~2-~2: Use ‘will’ instead of ‘going to’ if the following action is certain.
Context: ...lication to demonstrate how agentic era is going to be. Used Smol Agents framework by huggi...
(GOING_TO_WILL)
🪛 markdownlint-cli2 (0.17.2)
6-6: Bare URL used (MD034, no-bare-urls)
🤖 Prompt for AI Agents
In researchagent-smolagents/README.md lines 1 to 11, the README is missing
essential setup and usage information. Add installation instructions for
dependencies using pip install -r requirements.txt, provide setup steps for
creating a .env file with the Hugging Face API token, include usage instructions
on how to run the Streamlit app with streamlit run agents.py, and mention the
URL to access the app. Also, correct grammar issues in the introduction for
clarity.
This PR introduces a powerful, interactive Web Research Assistant built using Streamlit, integrating enhanced search tools, web scraping, AI agents, and markdown-based summarization. The tool is designed to provide recent, relevant, and readable information from the web, driven by the Hugging Face Qwen2.5-Coder-32B-Instruct model and DuckDuckGo search APIs.
🧠 Key Features:
Streamlit Interface: Intuitive, real-time search assistant with multi-tab results display (Results, Sources, Analysis, Logs).
Install the dependencies and add your Hugging Face token.