ML: add defensive check to ensure Unknown endpoints cannot also be NotASink #8546

jhelie · 2022-03-24T12:34:56Z

As discussed with @henrymercer, this PR adds a defensive check to the getAnUnknown() predicate in ExtractEndpointData.qll. This should ensure that, for a given query, the label of an endpoint takes a unique value in {NotASink, Sink, Unknown}.
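
The invariant can be illustrated with a minimal sketch, written in Python rather than QL. All names here (`Endpoint`, `is_unknown`, `labels`) are hypothetical and do not come from ExtractEndpointData.qll. The sketch assumes that an Unknown candidate is an effective sink that is not a known sink, and that the defensive check additionally excludes anything already labelled NotASink:

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical stand-in for an endpoint and its per-query properties."""
    name: str
    is_sink: bool            # known sink for the query
    is_not_a_sink: bool      # already labelled NotASink
    is_effective_sink: bool  # passes the endpoint filters (assumed meaning)


def is_unknown(e: Endpoint, defensive: bool = True) -> bool:
    # Assumed original condition: effective sink that is not a known sink.
    candidate = e.is_effective_sink and not e.is_sink
    if defensive:
        # Defensive check: never label an endpoint Unknown if it is NotASink.
        return candidate and not e.is_not_a_sink
    return candidate


def labels(e: Endpoint, defensive: bool = True) -> set[str]:
    """Collect every label that applies to the endpoint."""
    result = set()
    if e.is_sink:
        result.add("Sink")
    if e.is_not_a_sink:
        result.add("NotASink")
    if is_unknown(e, defensive):
        result.add("Unknown")
    return result


# An effective sink that the filters failed to exclude from NotASink.
leaky = Endpoint("example", is_sink=False, is_not_a_sink=True, is_effective_sink=True)
print(labels(leaky, defensive=False))  # two labels: the problem this PR guards against
print(labels(leaky))                   # one label once the defensive check is applied
```

The check only hides the overlap at labelling time; the endpoint filters that let such an endpoint through are unchanged in this sketch.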

The root problem should be fixed upstream by improving the endpoint filters, see https://github.com/github/ml-ql-adaptive-threat-modeling/issues/1818.

PR checks:

✅ here is the dev pipeline triggered using the proposed change in the QL library
✅ here is the model trained on the output of the above dev pipeline and here is its end-to-end evaluation

I therefore propose this PR is good to merge.

Relates to https://github.com/github/ml-ql-adaptive-threat-modeling/issues/1757

jhelie · 2022-03-24T15:53:58Z

(Thanks for the rename, @owen-mc, will do so next time 👍)

jhelie · 2022-04-13T12:14:00Z

Thanks for the review, @henrymercer. I agree it's a good idea to add a regression test, but I will need your help to implement it. I don't think you have any more office hours this week, and after that you are on holiday, so do you think we can schedule something this week? I think it would be better for this PR to be merged now rather than in 2 weeks.

jhelie · 2022-04-13T14:41:42Z

Here is an example of a sink that is currently labelled as both Unknown and NotASink (there are 8 others that can be found in sudheeshshetty-chat when running the test pipeline).

cc @annarailton

jhelie · 2022-04-13T16:19:22Z

@henrymercer: thanks to @annarailton's help I could update the tests. I've added two commits, one before the fix and one after it, so that we can easily verify that the change in getAnUnknown modifies the outcome of the new regression test (by removing the Unknown label).

I think this addresses all your comments.

henrymercer

Looks great, thanks for implementing that regression test and ✨ to @annarailton for 🍐ing on it!

jhelie requested a review from a team March 24, 2022 12:34

github-actions bot added the JS label Mar 24, 2022

jhelie force-pushed the jhelie/enforce-unknown-incompatibiliy-with-notasink branch 3 times, most recently from 1c8011c to cf820e7 Compare March 24, 2022 14:35

jhelie marked this pull request as draft March 24, 2022 14:36

owen-mc changed the title ~~add defensive check to ensure Unknown endpoints cannot also be NotASink~~ ML: add defensive check to ensure Unknown endpoints cannot also be NotASink Mar 24, 2022

jhelie force-pushed the jhelie/enforce-unknown-incompatibiliy-with-notasink branch from cf820e7 to 5abd885 Compare March 28, 2022 10:56

jhelie force-pushed the jhelie/enforce-unknown-incompatibiliy-with-notasink branch from 5abd885 to c080ee9 Compare April 8, 2022 11:47

jhelie assigned henrymercer Apr 11, 2022

jhelie marked this pull request as ready for review April 11, 2022 12:46

henrymercer reviewed Apr 11, 2022

...ript/ql/experimental/adaptivethreatmodeling/modelbuilding/extraction/ExtractEndpointData.qll

...t/ql/experimental/adaptivethreatmodeling/test/endpoint_large_scale/EndpointFeatures.expected

henrymercer removed their assignment Apr 11, 2022

jhelie force-pushed the jhelie/enforce-unknown-incompatibiliy-with-notasink branch from c080ee9 to cb88b26 Compare April 13, 2022 11:53

ML: fix ATM expected tests outputs

407a8a7

jhelie force-pushed the jhelie/enforce-unknown-incompatibiliy-with-notasink branch from cb88b26 to b46cbad Compare April 13, 2022 12:02

jhelie force-pushed the jhelie/enforce-unknown-incompatibiliy-with-notasink branch from b46cbad to 95b2ce3 Compare April 13, 2022 16:08

jhelie added 3 commits April 13, 2022 18:14

ML: add regression test for effective sink that is also NotASink

f2b813a

ML: add defensive check to ensure Unknown endpoints cannot also be NotASink

f87cd16

ML: update regression test output following fix to getAnUnknown predicate

1e39a9c

jhelie force-pushed the jhelie/enforce-unknown-incompatibiliy-with-notasink branch from 95b2ce3 to 1e39a9c Compare April 13, 2022 16:14

jhelie requested a review from henrymercer April 13, 2022 16:19

henrymercer approved these changes Apr 13, 2022

jhelie merged commit d094bbc into main Apr 14, 2022

jhelie deleted the jhelie/enforce-unknown-incompatibiliy-with-notasink branch April 14, 2022 09:21
