Moved excess information and documentation to external dev doc #55826

tshaffercodeorg · 2024-01-18T20:35:51Z

This is a follow-up fix for the conversation in PR #55633.

Executive Summary

Migrated background information and model evaluation scripts/logs to an external Google Drive folder
Streamlined README.md and inline documentation


thomasoniii

lgtm 🚀

fisher-alice

Thanks for iterating on this! Just left a handful of minor comments/suggestions.

fisher-alice · 2024-01-24T17:45:12Z

apps/script/HoC2023ScriptFiles/HoC2023AiHelperFunctions.py

-    # Math explanation: Cosine distance outputs a value between 0 -> 1 where smaller values = greater similarity
-    # We can redefine this into cosine similarity with a simple (x-1)*-1 due to their mathematical relationship
-    # Since we take the SUM(MAX(similarity)) value when determining which options to present the user, cosine similarity is preferable
+    # Conversion from cosine distance to cosine similarity for easier readability in frontend computations.


Nit: front-end since used as adjective. (Sorry I didn't include in prior comment.)

fisher-alice · 2024-01-24T17:48:13Z

apps/script/HoC2023ScriptFiles/HoC2023AiHelperFunctions.py

+    # Conversion from cosine distance to cosine similarity for easier readability in frontend computations.
+    # Math explanation: Cosine distance outputs a value between 0 -> 1 where smaller values = greater similarity.
+    # Cosine similarity redefines this relationship so that instead larger values = greater ssimilarity.
+    # Since we expose some of these values to students in the frontend, we felt that similarity values growing larger would be easier to understand.


Nit suggestion: 'Since we expose some of these values to users in the frontend, we felt that greater values = greater similarity would be easier to understand.'
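For reference, the relationship the comment describes can be sketched as below. This is a minimal illustration, not the code in `HoC2023AiHelperFunctions.py`; the function names are hypothetical. It assumes the usual definition, cosine distance = 1 - cosine similarity, which is the same conversion as the removed comment's `(x-1)*-1`. Note that the 0 -> 1 range in the comment holds when the vectors never point in opposing directions; in general cosine distance can reach 2.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cosine_distance(a, b):
    """Smaller values mean the vectors are more alike."""
    return 1.0 - cosine_similarity(a, b)


def distance_to_similarity(distance):
    """Hypothetical helper: flips the scale so larger values mean more alike.

    Equivalent to (distance - 1) * -1.
    """
    return 1.0 - distance
```

With this convention, identical directions give a distance of 0 and a similarity of 1, while orthogonal vectors give a distance of 1 and a similarity of 0.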

fisher-alice · 2024-01-24T17:48:28Z

apps/script/HoC2023ScriptFiles/HoC2023AiHelperFunctions.py

-    # Since we take the SUM(MAX(similarity)) value when determining which options to present the user, cosine similarity is preferable
+    # Conversion from cosine distance to cosine similarity for easier readability in frontend computations.
+    # Math explanation: Cosine distance outputs a value between 0 -> 1 where smaller values = greater similarity.
+    # Cosine similarity redefines this relationship so that instead larger values = greater ssimilarity.


Nit: typo 'similarity'

fisher-alice · 2024-01-24T17:54:58Z

apps/script/HoC2023ScriptFiles/README.md

@@ -7,7 +7,7 @@ Run the `HoC2023AiGenerateWeights.py` to generate the associated output weights

 Before running the script, make sure to adjust your local parameters based off the current model being used. Previous iterations have leveraged spaCy and OpenAI's Ada models and it is not unreasonable to anticipate that the model "vendor" may change again in the future.

-As of 01/08/2024, this script uses AWS's Titan v1 LLM through their Bedrock API.
+As of 01/08/2024, this script uses AWS's Titan v1 LLM through their Bedrock API. For additional background context and testing resources, check the google drive here: https://docs.google.com/document/d/1beDoalfB1Y7BybN82YGhuzTNos_TE5l0dX4XdKPzNdw/edit?usp=sharing


You could link text to the URL, as done in the javabuilder README:
'For additional background context and testing resources, see the Dance AI Design Dev Doc.'

Also - could you define 'LLM' the first time it's introduced in the doc?

fisher-alice · 2024-01-24T17:56:29Z

apps/script/HoC2023ScriptFiles/README.md

-measure the relatedness of text strings. The distance between two vectors measures
-their relatedness. https://developers.google.com/machine-learning/crash-course/embeddings/video-lecture
-These embeddings are stored in caches files as pickle files, python's native way to serialize data.
+The embeddings used to generate these maps are stored in cached pickle files such as foreground_embeddings.pkl to prevent duplicate LLM API calls.


Nit: use backticks for file name:
'The embeddings used to generate these maps are stored in cached pickle files such as `foreground_embeddings.pkl` to prevent duplicate LLM API calls.'
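To illustrate what "cached pickle files ... to prevent duplicate LLM API calls" could look like, here is a rough sketch. It is an assumption about the general pattern, not the script's actual code: `load_or_compute_embeddings` and `embed_fn` are hypothetical names, and `embed_fn` stands in for whatever call reaches the embedding model.

```python
import os
import pickle


def load_or_compute_embeddings(cache_path, texts, embed_fn):
    """Return {text: embedding}, calling embed_fn only for texts not yet cached.

    cache_path: pickle file such as foreground_embeddings.pkl
    embed_fn:   hypothetical callable that takes a string and returns a vector
    """
    # Load the existing cache if there is one; otherwise start empty.
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            cache = pickle.load(f)
    else:
        cache = {}

    # Only texts missing from the cache trigger a (potentially paid) model call.
    missing = [text for text in texts if text not in cache]
    for text in missing:
        cache[text] = embed_fn(text)

    # Rewrite the pickle file only when something new was added.
    if missing:
        with open(cache_path, "wb") as f:
            pickle.dump(cache, f)

    return {text: cache[text] for text in texts}
```

Since unpickling can execute arbitrary code, a cache like this should only be loaded from files the project itself produced.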

fisher-alice · 2024-01-24T20:21:36Z

apps/script/HoC2023ScriptFiles/README.md


-At runtime, DanceAI will use the three maps to lookup the scores for each output type and take the top three indexes of (MAX(SUM(Input1Scores, Input2Scores, Input3Scores))) to select a final palette/foreground/background to display to the user. These maps are stored as a local cache rather than generated at runtime to remove the costs associated with querying a LLM and improve runtime performance.
+At runtime, DanceAI will use the three maps to lookup the scores for each output type and randomly select one of the top 3 results of MAX(SUM(Input1Scores, Input2Scores, Input3Scores)) to select a final palette/foreground/background to display to the user. These maps are stored as a local cache rather than generated at runtime to remove the costs associated with querying a LLM and improve runtime performance.


Since the frontend presents both a randomly selected top version (from the top 3) and randomly selected bottom versions (4 from the bottom 20), maybe we can just state that the three maps are used to look up scores for each output type. We could also refer readers to the calculateOutputSummedWeights.ts file.
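For readers following the thread, the "sum the three input score rows, then pick randomly among the top 3" step from the README wording might look roughly like this. It is a sketch under assumptions, written in Python for consistency with the script folder; it is not the logic in `calculateOutputSummedWeights.ts`, all function names are hypothetical, and the bottom-20 selection mentioned above is not shown.

```python
import random


def sum_scores(score_rows):
    """Element-wise sum of the score rows looked up for each selected input."""
    return [sum(column) for column in zip(*score_rows)]


def top_k_indexes(summed, k=3):
    """Indexes of the k largest summed scores, highest first."""
    ranked = sorted(range(len(summed)), key=lambda i: summed[i], reverse=True)
    return ranked[:k]


def pick_output(score_rows, k=3, rng=random):
    """Randomly choose one output index from the k best-scoring candidates."""
    return rng.choice(top_k_indexes(sum_scores(score_rows), k))


# Example: three inputs, each with a score for four candidate outputs.
rows = [
    [0.1, 0.9, 0.3, 0.5],
    [0.2, 0.8, 0.4, 0.1],
    [0.3, 0.7, 0.6, 0.2],
]
chosen = pick_output(rows)  # one of the indexes 1, 2, or 3
```

Because the maps are precomputed, this lookup needs no model call at runtime, which is the cost and performance point the README makes.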


tshaffercodeorg added 6 commits January 18, 2024 12:56

Moved excess information and documentation to external dev doc

fb06e56

Merge branch 'staging' of https://github.com/code-dot-org/code-dot-org …

47bb0d3

…into tyrone/hoc2023-ai-documentation

Adjusted verbose math explanation

9b8c18f

Additional verbage adjustments

0a61018

Updated link to google drive for proper sharing permissions

119b1a1

Moved embeddings explanation to dev doc

346dbbe

tshaffercodeorg requested a review from fisher-alice January 23, 2024 22:13

fisher-alice requested a review from a team January 24, 2024 16:47

thomasoniii approved these changes Jan 24, 2024


fisher-alice approved these changes Jan 24, 2024


tshaffercodeorg merged commit f57ff0e into staging Jan 26, 2024
2 checks passed

tshaffercodeorg deleted the tyrone/hoc2023-ai-documentation branch January 26, 2024 18:37

mikeharv pushed a commit that referenced this pull request Feb 5, 2024

Merge pull request #55826 from code-dot-org/tyrone/hoc2023-ai-documen…

4d93aa1

…tation Moved excess information and documentation to external dev doc
