Add logic within summarisation route to select appropriate stuff or map reduce method #681

andy-symonds · 2024-06-27T16:05:33Z

Context

When a user wants to summarise a document, they are not concerned with which method performs the summarisation. Under the hood, however, the summarisation route in core-api needs to select the appropriate method: stuff for basic summarisation, or map reduce for large documents.

Changes proposed in this pull request

Reduced the summarisation runnable to a single build_summary_chain. Logic within the LangChain runnable decides which summarisation method to use based on the size of the document, with the limit set by summarisation_chunk_max_tokens.

core-api now passes back to the frontend which summarisation method was used.
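
The selection described above can be illustrated with a minimal sketch. This is not the code from this pull request: it uses plain Python rather than LangChain runnables, and `select_summarisation_method`, `summarise`, `count_tokens`, the whitespace token count, and the default limit of 20 are all hypothetical stand-ins. It assumes the decision compares the total token count of the document chunks against summarisation_chunk_max_tokens, and that an empty document list is treated as an error.

```python
from typing import Callable, Dict, List

# Hypothetical default standing in for the summarisation_chunk_max_tokens setting.
SUMMARISATION_CHUNK_MAX_TOKENS = 20


def count_tokens(text: str) -> int:
    """Crude whitespace token count (assumption; a real tokeniser would differ)."""
    return len(text.split())


def select_summarisation_method(
    chunks: List[str], max_tokens: int = SUMMARISATION_CHUNK_MAX_TOKENS
) -> str:
    """Return "stuff" if all chunks fit within max_tokens, else "map_reduce"."""
    if not chunks:
        # Assumption: no documents is an error rather than an empty summary.
        raise ValueError("No documents to summarise")
    total = sum(count_tokens(chunk) for chunk in chunks)
    return "stuff" if total <= max_tokens else "map_reduce"


def summarise(
    chunks: List[str],
    summarise_text: Callable[[str], str],
    max_tokens: int = SUMMARISATION_CHUNK_MAX_TOKENS,
) -> Dict[str, str]:
    """Summarise chunks and report which method was used.

    summarise_text is a hypothetical callable standing in for the LLM call.
    """
    method = select_summarisation_method(chunks, max_tokens)
    if method == "stuff":
        # Stuff: put every chunk into a single prompt.
        summary = summarise_text("\n\n".join(chunks))
    else:
        # Map reduce: summarise each chunk, then summarise the summaries.
        partials = [summarise_text(chunk) for chunk in chunks]
        summary = summarise_text("\n\n".join(partials))
    return {"summary": summary, "method": method}
```

Returning the method alongside the summary mirrors the idea that the caller, here the frontend, can be told which route was taken without having to choose it.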

Guidance to review

Relevant links

Things to check

I have added any new ENV vars in all deployed environments
I have tested any code added or changed
I have run integration tests

wpfl-dbt

Mostly fine!

wpfl-dbt · 2024-06-28T15:11:35Z

core_api/src/build_chains.py

-    return (
-        RunnablePassthrough.assign(documents=(make_document_context | RunnableLambda(format_documents)))
-        | make_chat_prompt_from_messages_runnable(
+    # Stuff chain now missing the RunnableLambda to format the chunks


This doesn't need to make it into production

wpfl-dbt · 2024-06-28T15:17:16Z

pyproject.toml

@@ -45,6 +45,7 @@ pytest = "^8.2.2"
 pytest-env = "^1.1.1"
 pytest-mock = "^3.14.0"
 pytest-cov = "^5.0.0"
+pytest-dotenv = "^0.5.2"


We already use pytest-env and tbh I strongly prefer it, but if we're going this way we should remove it. No one actually uses it.

tests/test_journey.py

wpfl-dbt

Happy

andy-symonds added 3 commits June 27, 2024 16:27

[REDBOX-410] | AS | Add logic within build_summary_chain to route to …

67eefb0

…stuff or map reduce summarisation method based on length of list of strings

[REDBOX-410] | AS | Corrected prompts used in build_retrieval_chain t…

93e4c01

…o use the retrieval prompts

[REDBOX-410] | AS | Added specific chat routes for stuff summarisatio…

906fe86

…n for basic summarisation and map reduce summarisation for large documents

jamesrichards4 reviewed Jun 28, 2024

redbox/models/chat.py (outdated, resolved)

jamesrichards4 added 2 commits June 28, 2024 13:57

Added tests and reworked summarisation tests to handle that no docs r…

11bcc32

…eturned is now an error

Merge branch 'main' into feat/runnable-logic-route-summarisation

1d75eed

jamesrichards4 marked this pull request as ready for review June 28, 2024 14:04

jamesrichards4 added 2 commits June 28, 2024 14:07

Added pytest-dotenv to dev dependencies

5713f0d

Removing journey_tests checking for specific route names

65802fe

wpfl-dbt reviewed Jun 28, 2024

Using route prefix names to allow matching on subroutes in summarisation

8367843

wpfl-dbt approved these changes Jun 28, 2024

jamesrichards4 merged commit ac1b9f1 into main Jun 28, 2024
6 checks passed
