Match runner overview #4

Open

devonwalshe opened this issue Sep 16, 2020 · 2 comments
devonwalshe commented Sep 16, 2020

I've set up the endpoints that will let the user upload two files, launch a match between them, and then output the results of the match to a CSV. I'm going to discuss the endpoints here so we can work together on tying it all up.

Before you read this - please review #2 to refresh yourself on the user flow (we are dropping item 5 for now).

I imagine you'll have questions so let me know and I'll update here and in Slack so we're on the same page.

Short version:

  • POST to /pipeline/ - make a new pipeline
  • POST to /raw_file/ - upload a file - do this twice, with additional form data (discussed below)
  • POST to /run_match/ - make a new run match with all of its related data
  • POST to /matchrunner/<runmatch_id>/ - launch a data match on the files
  • GET to /matchrunner/<runmatch_id>/export - returns the data as CSV text, which you'll then need to package into a file for the user to download.
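
To make the setup calls concrete, here's a rough sketch of the first three requests from the frontend using fetch and FormData. The response field names (id, inspection_run_id) are placeholders, not the confirmed API - check the actual responses (or fall back to GET /inspection_runs/ as described in item 2 below).

```ts
// Sketch only: response field names (id, inspection_run_id) are assumptions,
// not the confirmed API - adjust to whatever the endpoints actually return.

async function postJson<T>(path: string, body: unknown): Promise<T> {
  const res = await fetch(path, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`POST ${path} failed: ${res.status}`);
  return res.json() as Promise<T>;
}

// 1. Create the pipeline - just needs a name.
function createPipeline(name: string) {
  return postJson<{ id: number }>("/pipeline/", { name });
}

// 2. Upload one raw file plus its extra attributes as multipart/form-data.
//    Call this once per input dataset (so twice per run match).
async function uploadRawFile(fields: {
  file: File;
  source: string;
  data_mapping_id: number;
  pipeline_id: number;
  run_date: string;
  sheet_name: string;
}) {
  const form = new FormData();
  form.append("file", fields.file);
  form.append("source", fields.source);
  form.append("data_mapping_id", String(fields.data_mapping_id));
  form.append("pipeline_id", String(fields.pipeline_id));
  form.append("run_date", fields.run_date);
  form.append("sheet_name", fields.sheet_name);

  const res = await fetch("/raw_file/", { method: "POST", body: form });
  if (!res.ok) throw new Error(`raw_file upload failed: ${res.status}`);
  // The response includes the id of the auto-created InspectionRun
  // (field name assumed; GET /inspection_runs/ is the fallback lookup).
  return res.json() as Promise<{ id: number; inspection_run_id: number }>;
}

// 3. Create the run match from the two inspection runs.
function createRunMatch(name: string, run_a: number, run_b: number, pipeline_id: number) {
  return postJson<{ id: number }>("/run_match/", { name, run_a, run_b, pipeline_id });
}
```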

More detail:

  1. Pipeline

    • This is the top of the food chain for the application; the InspectionRun and RunMatch resources both require a pipeline as their starting reference point.
    • POST to /pipeline/ with a name parameter, that's it.
  2. RawFile

    • Input dataset. We need two of these to set up a run match. I think the page where we start a new run match should begin with a form where you enter both files and all the necessary data, but actually send two POST requests (we can discuss this more).
    • A POST request to /raw_file/ now requires additional data attributes, sent as a multi-part form:
      • file - file upload,
      • source - text input,
      • data_mapping_id - dropdown (endpoint /feature_maps/),
      • pipeline_id - dropdown,
      • run_date - text input,
      • sheet_name - text input
    • You will get the corresponding InspectionRun id (it gets created automatically) in the response from the POST. I can make it available elsewhere if you need, but you could also traverse backwards from GET /inspection_runs/, which lists raw_file_ids.
  3. InspectionRun

    • An Inspection Run is created automatically (this is important because the RunMatch references the inspection runs directly, not the raw files).
    • You need two inspection run ids to create a run match.
  4. RunMatch

    • The parent for all our match data
    • After we upload 2 RawFiles, it will generate 2 corresponding InspectionRuns that we can use to generate a RunMatch.
    • A POST to RunMatch requires just a name, run_a (earlier InspectionRun), run_b (later InspectionRun), and a pipeline_id
    • A RunMatchConf record is created in the background with my standard defaults, but we should allow the user to tweak it before launching the matcher.
  5. MatchRunner

    • endpoint at POST /match_runner/<run_match_id> to launch the match for the run
    • I tried to make it run in the background, but that needs more work; the DB was causing connection issues.
    • This will be a long-running request, matching the data in the background (the launch and export calls are sketched after this list).
    • I've added some narrative info to the GET /run_match/1 endpoint
  6. MatchExporter

    • endpoint at GET /matchrunner/<run_match_id>/export
    • We should have an export CSV button for this in two places - on the matching interface and on the run_match listings.
    • Another long-running process; it assembles all the data from the database for the run match and outputs a CSV as text.
    • The user should receive a file download prompt.
    • I figured there wasn't any point sending a file, which you would have to handle on your end, so I decided on text - let me know if that works.
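
A sketch of how the launch and the CSV download could look on the frontend. Note the path spelling differs between the short version (/matchrunner/...) and item 5 above (/match_runner/...), so check which one the backend actually registers; the sketch uses the short-version spelling.

```ts
// Sketch only, assuming same-origin requests and the paths from the short version.
// The matchrunner call blocks until the match finishes, so the UI should show a
// spinner (or poll GET /run_match/<id> for progress once that exists).

async function launchMatch(runMatchId: number): Promise<void> {
  const res = await fetch(`/matchrunner/${runMatchId}/`, { method: "POST" });
  if (!res.ok) throw new Error(`match launch failed: ${res.status}`);
}

// Fetch the CSV text and hand it to the browser as a file download.
async function exportMatchCsv(runMatchId: number): Promise<void> {
  const res = await fetch(`/matchrunner/${runMatchId}/export`);
  if (!res.ok) throw new Error(`export failed: ${res.status}`);
  const csvText = await res.text();

  // Package the text into a file so the user gets a download prompt.
  const blob = new Blob([csvText], { type: "text/csv" });
  const url = URL.createObjectURL(blob);
  const link = document.createElement("a");
  link.href = url;
  link.download = `run_match_${runMatchId}.csv`; // file name is our choice
  link.click();
  URL.revokeObjectURL(url);
}
```

The Blob + object URL step is just one way to turn the returned CSV text into a browser download prompt.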

sheinin commented Sep 16, 2020

  1. Pipeline

Do you see it as a grid listing + "add new" interface?

  2. RawFile

Sounds like one screen with two identical input sections for files 1 and 2, and the submit button that sends two separate POST requests.

  3. InspectionRun

A display-only grid of "/inspection_runs"?

  4. RunMatch

I understand that this is our main interface, which now opens with the "New" button and the grid, and will have an input form as described:

    A POST to RunMatch requires just a name, run_a (earlier InspectionRun), run_b (later InspectionRun), and a pipeline_id

  5. MatchRunner

Is this a grid of run matches with a "launch" button and job status? Why not add "run" + status to screen 4?

    I've added some narrative info to the GET /run_match/1 endpoint

Is it used in the UI?


devonwalshe commented Sep 16, 2020

  1. Pipeline
    Do you see it as a grid listing + "add new" interface?

Yes please!

  2. RawFile
    Sounds like one screen with two identical input sections for files 1 and 2, and the submit button that sends two separate POST requests.

Yes, with one page, but we also need to POST the other form data for each raw file, as I've discussed above.

  3. InspectionRun
    A display-only grid of "/inspection_runs"?

I was putting it above for reference, not for frontend implementation - the users don't need to see this for now. However, you need the IDs of the two inspection runs (run_a, run_b) in order to create the run_match, as you mentioned below.

  4. RunMatch
    I understand that this is our main interface, which now opens with the "New" button and the grid, and will have an input form as described:
    A POST to RunMatch requires just a name, run_a (earlier InspectionRun), run_b (later InspectionRun), and a pipeline_id

Precisely!

  5. MatchRunner
    Is this a grid of run matches with a "launch" button and job status? Why not add "run" + status to screen 4?

  1. The run match launch should take place on the page after you upload the raw files, where we currently have the configuration form. The default data for the configuration form should be pulled from run_match/<id> or run_matches - the conf is all included there now (a small prefill sketch follows at the end of this comment).

  2. The best place to launch the export, I think, is in the grid panel for the run match listings - just add a button to it saying "export data".

    I've added some narrative info to the GET /run_match/1 endpoint

    Is it used in the UI?

I think you can use the pipe_section_count and sections_checked datapoints.
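
For the configuration form defaults, roughly something like this - the response shape of GET /run_match/<id> (a nested conf object plus the pipe_section_count and sections_checked datapoints) is assumed here, not taken from the actual serializer:

```ts
// Sketch only: the RunMatchDetail shape is an assumption based on the discussion above.

interface RunMatchDetail {
  id: number;
  name: string;
  conf?: Record<string, string | number | boolean>; // RunMatchConf defaults (assumed field)
  pipe_section_count?: number;                      // narrative datapoints mentioned above
  sections_checked?: number;
}

async function fetchRunMatch(runMatchId: number): Promise<RunMatchDetail> {
  const res = await fetch(`/run_match/${runMatchId}`);
  if (!res.ok) throw new Error(`run_match fetch failed: ${res.status}`);
  return res.json() as Promise<RunMatchDetail>;
}

// Prefill the configuration form with the stored defaults before the user launches the matcher.
async function prefillConfigForm(runMatchId: number, form: HTMLFormElement): Promise<void> {
  const detail = await fetchRunMatch(runMatchId);
  for (const [key, value] of Object.entries(detail.conf ?? {})) {
    const input = form.elements.namedItem(key);
    if (input instanceof HTMLInputElement) input.value = String(value);
  }
}
```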
