Development Branch #220

rvguha · 2025-06-18T22:43:40Z

Pulled all the changes from the earlier unmerged branches into this branch and resolved the conflicts. We can now test this branch and then merge it into main.

- Added AccompanimentHandler for finding complementary items (wines, sides, sauces)
- Added SubstitutionHandler for recipe ingredient substitutions
- Updated tools.xml with handler definitions for new tools
- Added UI support for substitution suggestions in managed-event-source.js
- Integrated new tools with the general handler-based routing approach

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

- Added podcast_scraper.py (renamed from npr_podcast_scraper_complete.py) for scraping podcast data
- Added process_npr_rss_by_org.py for processing NPR RSS feeds by organization
- Cherry-picked only the NPR-specific files from the PR to avoid reverting other changes

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

- Added modular testing framework with base_test_runner.py
- Added specific test runners for end-to-end, site retrieval, and query retrieval tests
- Added comprehensive test runner scripts (run_all_tests.sh/.bat, run_tests_comprehensive.sh)
- Updated run_tests.py to be a test dispatcher that routes to appropriate test runners
- Added JSON test configuration files for different test types
- Updated testing README with comprehensive documentation
- Preserved all existing functionality while adding new testing capabilities

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
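The dispatcher role described for run_tests.py can be pictured with a minimal sketch. This is not the project's code: the `--type` and `--config` flags, the runner function names, and the config file names are hypothetical and chosen only to illustrate routing a request to one of several test runners.

```python
import argparse
from typing import Callable, Dict, List


# Hypothetical runners; each stands in for a module such as an
# end-to-end, site retrieval, or query retrieval test runner.
def run_end_to_end(config_path: str) -> str:
    return f"end_to_end:{config_path}"


def run_site_retrieval(config_path: str) -> str:
    return f"site_retrieval:{config_path}"


def run_query_retrieval(config_path: str) -> str:
    return f"query_retrieval:{config_path}"


# Registry that maps a test type to the runner responsible for it.
RUNNERS: Dict[str, Callable[[str], str]] = {
    "end_to_end": run_end_to_end,
    "site_retrieval": run_site_retrieval,
    "query_retrieval": run_query_retrieval,
}


def dispatch(argv: List[str]) -> str:
    """Parse the arguments and route to the matching runner."""
    parser = argparse.ArgumentParser(description="Route to a test runner")
    parser.add_argument("--type", choices=sorted(RUNNERS), required=True)
    parser.add_argument("--config", default="tests.json")
    args = parser.parse_args(argv)
    return RUNNERS[args.type](args.config)


outcome = dispatch(["--type", "site_retrieval", "--config", "site_tests.json"])
```

Keeping the registry in one dictionary means a new test type only needs a new entry, and argparse rejects unknown types before any runner starts.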

- Moved retrieve_and_rank_for_query out as standalone method
- Replaced get_vector_db_client with search abstraction
- Fixed handling of 4-tuple search results
- Skip tool selection when generate_mode is summarize or generate
- Applied logging improvements from PR #204 to decontextualize.py

…s.py
- Kept the new _get_item_by_url() method from development (URL-based retrieval feature)
- Preserved _send_no_items_found_message() method from both branches
- Maintained all functionality from both branches

Copilot

Pull Request Overview

This pull request consolidates many changes by merging previously unmerged branches into the development branch and resolving conflicts. The changes include adjustments and new features for event streaming and chat interfaces in JavaScript, extensive documentation updates, improvements to the retrieval system configuration, and numerous additions to the testing and utility scripts.

Reviewed Changes

Copilot reviewed 62 out of 67 changed files in this pull request and generated 2 comments.

File	Description
static/streaming.js	Update parsing of “details” property and change from a bulleted list to a table for array formatting.
static/managed-event-source.js	Introduce new message types and debug message collection, as well as enhancements to tool selection flow.
docs/*	New and revised documentation on tools, retrieval, headers, and configuration guidelines.
code/utils/json_utils.py	New JSON trimming and merging functions with recursive merge logic.
code/tools/*	Refactoring of Qdrant loading and other tool modules, including asynchronous adaptations and configuration adjustments.
code/testing/*	Numerous new test runners and scripts for site retrieval, query retrieval, end-to-end testing and comprehensive test suite improvements.

Comments suppressed due to low confidence (1)

code/tools/db_load.py:922

Ensure that the configuration migration from 'preferred_retrieval_endpoint' to 'write_endpoint' is clearly documented in the migration guide to help developers adjust their setups.

    endpoint_name = database or CONFIG.write_endpoint
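The fallback in the reviewed line, together with the rename it depends on, can be illustrated with a minimal sketch. Only `CONFIG.write_endpoint` and the legacy key name `preferred_retrieval_endpoint` come from the review; the `RetrievalConfig` class, the `from_dict` loader, the `resolve_endpoint` helper, and the endpoint names are hypothetical and are not the project's actual API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RetrievalConfig:
    """Hypothetical stand-in for the project's CONFIG object."""
    write_endpoint: Optional[str] = None

    @classmethod
    def from_dict(cls, raw: dict) -> "RetrievalConfig":
        # Prefer the new key; fall back to the legacy key so that
        # configurations written before the rename keep working.
        endpoint = raw.get("write_endpoint") or raw.get("preferred_retrieval_endpoint")
        return cls(write_endpoint=endpoint)


def resolve_endpoint(database: Optional[str], config: RetrievalConfig) -> Optional[str]:
    # Same shape as the reviewed line: an explicit argument wins,
    # otherwise the configured write endpoint is used.
    return database or config.write_endpoint


# Example endpoint names are made up for this sketch.
legacy = RetrievalConfig.from_dict({"preferred_retrieval_endpoint": "qdrant_local"})
current = RetrievalConfig.from_dict({"write_endpoint": "azure_ai_search"})
```

A loader that accepts both keys is one way to soften such a migration; whether the project does this is not stated in the review.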

Copilot · 2025-06-19T05:07:05Z

code/utils/json_utils.py

+                elif val1 == val2:
+                    merged[key] = val1
+                else:
+                    # Create array with both values


Consider adding a comment to document the rationale for merging non-identical scalar values into an array, ensuring that future maintainers understand this intentional behavior.

Suggested change

-                    # Create array with both values
+                    # Create array with both values to preserve both conflicting scalar values.
+                    # This ensures no data is lost during the merge process.
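The behavior under discussion can be sketched as follows. This is an illustration under stated assumptions rather than the code in json_utils.py: the function name `merge_values` and the sample data are hypothetical, and the treatment of lists (wrapped like any other conflicting value) is an assumption, since the excerpt shows only the scalar branch.

```python
from typing import Any


def merge_values(val1: Any, val2: Any) -> Any:
    """Merge two JSON-like values without discarding conflicting data."""
    if isinstance(val1, dict) and isinstance(val2, dict):
        merged = dict(val1)
        for key, other in val2.items():
            # Recurse only for keys present on both sides.
            merged[key] = merge_values(val1[key], other) if key in val1 else other
        return merged
    if val1 == val2:
        # Identical values collapse to a single copy.
        return val1
    # Conflicting values: keep both in an array so nothing is lost.
    return [val1, val2]


a = {"name": "Soup", "servings": 4, "nutrition": {"calories": 200}}
b = {"name": "Soup", "servings": 6, "nutrition": {"calories": 200, "fat": "5g"}}
result = merge_values(a, b)
# result["servings"] is [4, 6]; result["name"] stays "Soup"
```

One consequence worth documenting alongside the comment: a key that held a scalar in both inputs may hold a list in the output, so callers need to handle both shapes.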

Copilot · 2025-06-19T05:07:05Z

code/tools/process_npr_rss_by_org.py

+                    stats['failed_conversion'] += 1
+
+                # Be respectful with requests
+                time.sleep(2)


[nitpick] Consider using an asynchronous sleep (e.g., await asyncio.sleep(2)) if the surrounding context is asynchronous to avoid blocking the event loop; if synchronous delay is intended, please document this decision.

Suggested change

-                time.sleep(2)
+                await asyncio.sleep(2)
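The difference the nitpick points at can be shown with a short sketch. The function `fetch_politely` and the feed names are hypothetical; the only library calls used are `asyncio.sleep`, `asyncio.gather`, and `asyncio.run`. Note that `await` is valid only inside an `async def` function, so the suggested change applies only if the surrounding code in process_npr_rss_by_org.py is already asynchronous.

```python
import asyncio
import time


async def fetch_politely(name: str, delay: float, log: list) -> None:
    # Hypothetical stand-in for one feed-processing step.
    log.append(f"{name} start")
    # asyncio.sleep yields to the event loop, so other tasks can run
    # during the pause; time.sleep here would block all of them.
    await asyncio.sleep(delay)
    log.append(f"{name} done")


async def main() -> tuple:
    log: list = []
    started = time.perf_counter()
    # Both coroutines pause concurrently, so the total time is close
    # to one delay rather than the sum of the two.
    await asyncio.gather(
        fetch_politely("feed-a", 0.2, log),
        fetch_politely("feed-b", 0.2, log),
    )
    return log, time.perf_counter() - started


log, elapsed = asyncio.run(main())
```

If the scraper is a plain synchronous script that processes one feed at a time, `time.sleep(2)` is a reasonable way to rate-limit requests, and a comment saying so would address the review.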

rvguha and others added 23 commits June 15, 2025 13:36

Improve decontextualization

525eb44

Improve decontextualization by passing along previous results as part of the request.

More small changes to conversational history

b670047

Revert "Merge pull request #213 from microsoft/revert-182-Enable-Multiple-Simultaneous-Retrieval-Backends"

adcf9e5

This reverts commit ffa1db3, reversing changes made to 6fbdbc6.

Fixes for multi database issues

64c233e

- also provides better abstraction
- adds a number of tests

fix item details UI

c5b2274

First pass at

98b1658

Merge main into Fix-multi-database-issues branch

18fcbb9

- Resolved conflict in code/retrieval/retriever.py
- Kept dynamic imports with auto-installation from main
- Fixed variable references (db_type instead of self.db_type)

misc fixes and UI for ensembles

de9782e

Create ensemble_test_queries.md

e57e3e7

Replacing all occurrences of preferred_retrieval_endpoint with write_endpoint

4d5db7a

Setting defaults in config_retrieval.yaml that will be best for end users

fe6522a

Fixed logic to verify valid credentials for retrievers

1cf9c96

Merge branch 'ensemble-tool' into development

3d5e1c6

Merge branch 'Fix-Item-Display-UI' into development

093cf55

Merge branch 'Fix-multi-database-issues' into development

dafab1f

Merge PR #206: Improve decontextualization and add URL-based retrieval support

bc63339

Merge branch 'Config-Improvements' into development

05bb837

Merge remote-tracking branch 'origin/main' into development

659e4c5

Fixed check connectivity script to work with multiple retrieval provider backend

006e35a

github-advanced-security bot found potential problems Jun 18, 2025

code/tools/podcast_scraper.py (dismissed)

Updated documentation and release notes

9a29909

rvguha requested a review from chelseacarter29 June 19, 2025 02:19

rvguha and others added 3 commits June 18, 2025 19:24

Merge branch 'main' into development

a4f2b3d

Merge branch 'development' of https://github.com/microsoft/NLWeb into development

58f5526

jennifermarsman requested a review from Copilot June 19, 2025 05:06

Copilot AI reviewed Jun 19, 2025

Changed to reasonable default retrievers for end users given the env variables in our .env.template

dc45dd0
