
Change task to allow a value for unmatched bonus #91

Merged · jonodrew merged 5 commits into main from set-unmatched-bonus on Jun 19, 2022
Conversation

@jonodrew (Owner) commented May 27, 2022

[2.3.0]

Changed

  • The system now uses pickle under the hood, so please be careful - if you've not secured the connections between the machines you're running this on, you could really get yourself into trouble (unpickling untrusted data can execute arbitrary code).
  • However, it has made it significantly faster - a single matching exercise is now down to just 40s, from a best of
    97s when we were using JSON to serialize data.

Added

  • This is the big one. We've added functionality that gradually increases an UnmatchedBonus. This is useful if you want to ensure everyone gets at least one mentor. It calculates a lot of values - one client is calculating 37 different iterations of a three-round program, requiring 111 rounds of matching - so it takes a while to run. This functionality will be exposed in the front end in an upcoming patch; in the meantime, dig around in the routes section or add a "pairing": True key-value pair to the JSON call you make to the appropriate endpoint (see the first sketch after this list).
  • Given the huge amount of processing happening, this functionality takes a lot longer than you might expect. It's enough time to make several cups of tea - on my hardware, it's clocking in at around 7 or 8 minutes. That's a long time to stare at the same screen. We'll be updating the frontend to give more feedback soon, but for the moment, either check the logs from celery or accept that you'll be here a little while.
  • A note about the approach: I could have built a system that iterated over potential outcomes sequentially, stopping when it reached an outcome that scored above a specific threshold. The problem is that, assuming each matching process takes n seconds and there are M candidate values to try, in the worst case iterating upwards takes Mn seconds. In the best case, of course, it takes just n seconds!
  • My approach batches the M candidate values into chunks of ten that run simultaneously, bringing the total down to roughly Mn/10 seconds (see the second sketch after this list). This is generally faster, although not in the case where the very first outcome is the one we want. Given that I can't predict things will be perfect every time, I've opted for the apparently longer approach.
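
A rough sketch of that JSON call. The endpoint path below is an assumption for illustration - check the routes section for the real one:

```python
import requests

# "pairing": True asks the system to run the exhaustive unmatched-bonus search;
# leave it out (or send False) to get the quicker default behaviour.
# The URL is a placeholder - point it at wherever the matching endpoint is mounted.
response = requests.post(
    "http://localhost:5000/tasks/run_task",
    json={"pairing": True},
)
print(response.status_code)
```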
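
A minimal sketch of the batching idea from the last bullet, assuming a hypothetical `run_matching(bonus)` helper that performs one complete matching exercise; the real system dispatches this work through celery rather than a local process pool:

```python
from concurrent.futures import ProcessPoolExecutor

def evaluate_bonuses(bonuses, run_matching, chunk_size=10):
    """Try every candidate unmatched bonus, running up to ten at once."""
    results = {}
    with ProcessPoolExecutor(max_workers=chunk_size) as pool:
        for start in range(0, len(bonuses), chunk_size):
            chunk = bonuses[start:start + chunk_size]
            # Each chunk of ten runs simultaneously, so M candidates cost
            # roughly (M / 10) * n seconds rather than M * n.
            for bonus, outcome in zip(chunk, pool.map(run_matching, chunk)):
                results[bonus] = outcome
    return results
```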

@github-actions bot commented May 27, 2022

✅ Result of Pytest Coverage

---------- coverage: platform linux, python 3.9.13-final-0 -----------

Name                     Stmts   Miss  Cover
/app/__init__.py            16      0   100%
/app/auth/__init__.py        3      0   100%
/app/auth/routes.py         17      2    88%
/app/classes.py             83      1    99%
/app/config.py              10      1    90%
/app/extensions.py           3      0   100%
/app/helpers.py             71     27    62%
/app/main/__init__.py        7      1    86%
/app/main/routes.py        118     10    92%
/app/tasks/__init__.py      10      2    80%
/app/tasks/helpers.py       11      0   100%
/app/tasks/tasks.py         34      0   100%
TOTAL                      383     44    89%
======================= 34 passed, 18

@jonodrew (Owner, Author) commented Jun 7, 2022

@johnpeart - I'd love your opinion on this please! Where should we put the form asking for the UnmatchedRule value? I'm thinking we need to insert a whole page...

Another approach I'm thinking of is to have a form that asks something like:

"Would you like to optimize for:

  • everyone getting at least one match, even if some people miss out on great matches
  • the best matches, even if that means more people don't get a match"

And then let the system work out the optimal approach by running multiple trials with different numbers for the unmatched bonus?

What do you think?

@johnpeart (Collaborator) commented:
Sorry for the delay.

I agree, adding a page in would be a good way of managing this. The page could be expanded, in future, to add other customisable variables.

I also like the idea of giving people human-readable options to select, and letting the system work out the best weighting. It seems a more 'user centered' option, to me at least.

Do you need me to do anything specific to enable this?

@jonodrew (Owner, Author) commented:
If we take the second option, I'll need a wee form for the user to select their preferred option. That's all I need from you.

For me, I'll need to work out how to run multiple attempts separately. A very interesting engineering challenge!

@johnpeart (Collaborator) commented:
I'll have time to put the form together at the weekend, if that works?
I can do the front end and have a go at the JS required to get the value of the selected field.

@jonodrew (Owner, Author) commented:
It shouldn't need any JS - as long as we use the design system we can keep using `<form>...</form>`.

I'm going to try and work out the backend this week. We're aiming for the highest total score (measured by summing the successful Match scores) that also results in every Mentee getting at least one Mentor.
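
A rough illustration of that selection rule, using invented attribute names rather than the real classes in app/classes.py:

```python
def best_outcome(outcomes):
    """Pick the highest-scoring outcome in which no mentee is left unmatched.

    Assumes each outcome exposes a `total_score` and a `mentees` collection
    whose members have a `mentors` list - illustrative names only.
    """
    viable = [o for o in outcomes if all(m.mentors for m in o.mentees)]
    return max(viable, key=lambda o: o.total_score) if viable else None
```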

app/classes.py Outdated

def map_model_to_output(self, data: dict):
    data["job title"] = data.pop("role")
    data["email address"] = data.pop("email")
    # data["email address"] = data.pop("email")
@jonodrew (Owner, Author) commented:

Note to self - fix this

@jonodrew (Owner, Author) commented:
@johnpeart

Forgive the mess of commits - I'm going to redo them so it's clearer what's actually happening here.

The important part is this line - the system now expects a true or false value passed to it. It's where we use the JavaScript to start the matching process, if you remember?

True will kick off the divining process, where it spins up every possible iteration of the entire three-round grid. It's an absolutely mammoth task that takes almost ten minutes to run.

@jonodrew (Owner, Author) commented:
[Screenshot: 2022-06-18 at 21:40:11]

Result of 6 minutes of work on the MacBook - the ideal, mentor-focussed outcome with the sample_data

@johnpeart (Collaborator) commented:
Quite a lot of this is going over my head!

But if I understand it at a rudimentary level, we want the form that toggles the 'true' or 'false' value to appear on the process.html page template, where I've added the commented-out HTML in commit 910534c.

Is that right?

@jonodrew (Owner, Author) commented Jun 19, 2022 via email

@johnpeart (Collaborator) commented:
That makes sense – though doing the routes, etc. is probably beyond my knowledge. I can make it look nice though!

Capturing the value of a radio button is relatively trivial in JS, so if you want me to do that instead, then I can do that.

@jonodrew (Owner, Author) commented Jun 19, 2022 via email

Moving the requirements installation step towards the beginning of the file means it takes less time to rebuild when testing.
@johnpeart (Collaborator) commented Jun 19, 2022

I've added some very rough routes and a page template, which I have put at /options.
It's in #92 and the related branch.

Hopefully that helps?!

@jonodrew force-pushed the set-unmatched-bonus branch 2 times, most recently from 982322e to e9f29c5, on June 19, 2022 at 19:34.
Pickle is Python's own serialisation library. It's not without its risks, but it massively increases speed and lets us pass around complex data objects - which is what we have here. That's what makes it possible to expose our "most mentees with a mentor" functionality.

To enable this I've added the `connections.setter` property in `CSPerson`, to allow pickle to load the data back into the object when it deserialises it. I've added some tests, but they need expanding - there's likely something around patching the long-running matching tasks to improve visibility of how everything is working.
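
A minimal sketch of that property/setter pattern; the `_connections` attribute name and element type here are assumptions for illustration, not the real `CSPerson` internals:

```python
class CSPerson:
    def __init__(self):
        self._connections: list["CSPerson"] = []

    @property
    def connections(self) -> list["CSPerson"]:
        return self._connections

    @connections.setter
    def connections(self, value: list["CSPerson"]) -> None:
        # A writable property, so code rebuilding the object after
        # deserialisation can assign the connections back in one step.
        self._connections = value
```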

The `run_task` route now takes a "pairing" variable, which indicates which function will be run to calculate the outcomes. The default is to use the quicker, one-round loop with an unmatched bonus of 6.
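
A sketch, under assumptions, of how a route can branch on that flag; the blueprint wiring and celery task names below are illustrative stand-ins, not the project's actual code:

```python
from flask import Blueprint, jsonify, request

# Illustrative imports: these task names are stand-ins, not the real ones
# in /app/tasks/tasks.py.
from app.tasks.tasks import find_best_unmatched_bonus, run_single_matching

bp = Blueprint("tasks", __name__)

@bp.route("/run_task", methods=["POST"])
def run_task():
    payload = request.get_json(silent=True) or {}
    if payload.get("pairing", False):
        # Exhaustive search over unmatched-bonus values (the slow path).
        task = find_best_unmatched_bonus.delay()
    else:
        # Quicker single loop with a fixed unmatched bonus of 6.
        task = run_single_matching.delay(unmatched_bonus=6)
    return jsonify({"task_id": task.id}), 202
```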
Various fixes here that make it easier to test the system locally
@jonodrew marked this pull request as ready for review on June 19, 2022 at 19:58.
@jonodrew merged commit 058d2c2 into main on Jun 19, 2022.
@jonodrew deleted the set-unmatched-bonus branch on June 19, 2022 at 19:59.