add migrations for ML generated fields #1153
base: master
@@ -0,0 +1,8 @@

DROP TABLE vulnerability.code_snippet;

-- Remove the generated summary that was added to help the LLM choose what to read
ALTER TABLE vulnerability.reference_content DROP COLUMN summary;

ALTER TABLE package.package DROP COLUMN readme_text;
ALTER TABLE package.package DROP COLUMN use_case_summary;
@@ -0,0 +1,22 @@

CREATE TABLE vulnerability.code_snippet
(
    id uuid DEFAULT public.gen_random_uuid() NOT NULL PRIMARY KEY,
    created_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP NOT NULL,
    -- Reference may be null because we may have pulled code from a non-web source such as vuln-db
    reference_id uuid NULL references vulnerability.reference,
    -- Include the URL since the reference might be null, but it's still nice to be able to point to a source, such as a vuln-db link, for non-scraped content
    source_url text NOT NULL,
    vulnerability uuid NOT NULL references vulnerability.vulnerability,
    code text NOT NULL,
    score integer NOT NULL,
    summary text NOT NULL,
    type text NOT NULL,
    language text NOT NULL
);

-- Add a generated summary to the reference to make it easier for the LLM to choose what to read
ALTER TABLE vulnerability.reference_content ADD COLUMN summary text NULL;

ALTER TABLE package.package ADD COLUMN readme_text text NULL;
A package already has
Why are we jamming the readme into that column? That isn't what I would expect; I'd expect it to have a short description. Also, "often"? When does it not? Does it depend on the ecosystem? Let's sort this out in standup because I'm interested!
ALTER TABLE package.package ADD COLUMN use_case_summary text NULL;
What will this column contain that is different from description?
We discussed this a few times, so I think you're familiar. By telling the LLM to summarize the use case specifically and avoid all other descriptions, we get much better vector proximity. It's not a general description; it's just "what is this for".
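The vector-proximity argument in the reply above can be illustrated with a small sketch. It assumes a toy bag-of-words embedding and cosine similarity; the PR does not say which embedding model or distance metric is actually used, and the function names and sample strings below are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (an assumption; stands in for a real model)."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical search query and two candidate texts for the same package.
query = embed("parse yaml configuration files")
use_case_summary = embed("parse and write yaml configuration files")
general_description = embed(
    "fast library with zero dependencies maintained since 2012 see changelog for install notes"
)

focused = cosine_similarity(query, use_case_summary)
general = cosine_similarity(query, general_description)
```

In this toy setup the text that only states what the package is for shares more terms with the query than the general description does, so it scores higher. A real embedding model behaves differently in detail, but the intuition in the comment is the same.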
What is the difference between this column and https://github.com/lunasec/lunasec/blob/master/lunatrace/bsl/hasura/migrations/lunatrace/1678406466712_add_parsed_content_to_reference_content/up.sql#LL1C81-L1C81?
We also discussed this the other day. This is a short, one-to-two-sentence description of the content, so that we can display it in a list for the LLM to choose what to read.
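As a rough illustration of the "list for the LLM to choose from" described above, here is a minimal sketch. The `ReferenceContent` class, its `id` field, and the menu format are all hypothetical; the PR only shows that `summary` is a nullable text column on `vulnerability.reference_content`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReferenceContent:
    """Hypothetical in-memory view of a vulnerability.reference_content row."""
    id: str
    summary: Optional[str]  # nullable, matching `summary text NULL`


def build_reference_menu(rows: list[ReferenceContent]) -> str:
    """Render one numbered line per reference that has a summary."""
    lines = []
    for row in rows:
        if not row.summary:
            continue  # rows without a generated summary are skipped
        lines.append(f"{len(lines) + 1}. [{row.id}] {row.summary.strip()}")
    return "\n".join(lines)


# Hypothetical sample rows; the second has no summary yet.
rows = [
    ReferenceContent("ref-a", "Advisory describing the affected versions."),
    ReferenceContent("ref-b", None),
    ReferenceContent("ref-c", "Blog post walking through a proof of concept."),
]
menu = build_reference_menu(rows)
```

Keeping each entry to a sentence or two is what makes a menu like this cheap to include in a prompt; the full parsed content would only be fetched for the references the model picks.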