Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add param to deep GET /proposals/{proposalId} #198

Closed
wants to merge 1 commit into from

Conversation

bickelj
Copy link
Contributor

@bickelj bickelj commented Jan 9, 2023

Add optional param to GET /proposals/{proposalId}

When optional query parameter includeFieldsAndValues is set to true on GET /proposals/{proposalId}, include all proposal versions, associated values, and associated application form fields in the response.

By returning this almost-fully-deep object tree, it is more convenient for the caller and more efficient in terms of request, query, and response counts. The assumption is that the purpose of the PDC is to see and compare which fields are used for what purpose for proposals.

Issue #101 Implement GET /proposals/{proposalId} endpoint

@codecov
Copy link

codecov bot commented Jan 9, 2023

Codecov Report

Base: 96.44% // Head: 96.62% // Increases project coverage by +0.18% 🎉

Coverage data is based on head (01a0e71) compared to base (e8392e9).
Patch coverage: 97.95% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #198      +/-   ##
==========================================
+ Coverage   96.44%   96.62%   +0.18%     
==========================================
  Files          47       47              
  Lines         704      801      +97     
  Branches      125      179      +54     
==========================================
+ Hits          679      774      +95     
- Misses         24       26       +2     
  Partials        1        1              
Impacted Files Coverage Δ
src/handlers/proposalsHandlers.ts 98.65% <97.75%> (-1.35%) ⬇️
src/middleware/errorHandler.ts 98.46% <100.00%> (+0.10%) ⬆️
src/types/Proposal.ts 100.00% <100.00%> (ø)
src/types/ProposalFieldValue.ts 100.00% <100.00%> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

Copy link
Contributor Author

@bickelj bickelj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pull request has two commits. The first is identical to the one in PR #192 and the second commit builds on the first. The assumption is that #192 is relatively uncontroversial because it adds basic functionality. PR #192 still needs review, however, but both changes could be applied in this single PR instead, in which case #192 would be obsoleted. If the additional commit here turns out to require extensive revision, #192 could be merged separately.

@@ -102,6 +105,8 @@ export const errorHandler = (
res: Response,
next: NextFunction, // eslint-disable-line @typescript-eslint/no-unused-vars
): void => {
logger.debug(err);
logger.trace(req.body);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes to errorHandler are not really related to the change, but I find myself frequently adding this to the error handler regardless. If desired, I can put this in a separate change.

type: 'object',
/* eslint-disable @typescript-eslint/no-unsafe-assignment */
properties: applicationFormFieldSchema.properties,
/* eslint-enable @typescript-eslint/no-unsafe-assignment */
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would really like to figure out a way to include the application form field schema without having to disable an eslint rule. This seemed to me the least worst option that I could come up with. I am eager to hear about better ways to achieve the goal, though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can actually just say:

applicationFormField: applicationFormFieldSchema

Which should solve your problem AND avoid some repeated code!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is much more elegant! Yes!

Unfortunately it doesn't "just work" without some more finagling. I'll figure out what else needs to be updated to get it to work.

Copy link
Contributor Author

@bickelj bickelj Jan 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allowing null (letting the field be optional) seems to be the sticking point:

Type 'string | undefined' is not assignable to type 'string'.

Edit: if I use readonly applicationFormField: ApplicationFormField | null; above rather than readonly applicationFormField?: ApplicationFormField | null; to add the field to interface ProposalFieldValue then the compiler has no problem with it. But I would have assumed this means it is a mandatory field (e.g. cannot be undefined) but is allowed to be null, and therefore no good. Yet our tests pass. There must be some nuances in the treatment of undefined/optional/null in JS, TS, JSON schema, and/or Ajv.

Copy link
Member

@slifty slifty Jan 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh got ya; this should probably do the trick:

applicationFormField: {
  ...applicationFormFieldSchema,
  nullable: true
}

@bickelj bickelj requested review from slifty and reefdog January 9, 2023 17:33
@reefdog
Copy link
Contributor

reefdog commented Jan 9, 2023

@bickelj Is it too late to change the base branch for this PR from main to 101-add-proposal-by-id-endpoint? When laddering PRs like this, that ensures this one only shows the commits added that aren't in that other commit. (GitHub's also smart enough to change the base on this PR to main once that other PR is merged in.)

If you do change something about the earlier PR, then you'll have to do some cherry-picking and force-pushing and whatnot to keep this one up to date. But that's usually rare and worthwhile.

No insistence here, just alerting you to a process that I've found useful in this scenario! For now, I can just drill into the delta commit (5582675) to view the diff.

@bickelj bickelj changed the title 101 deep proposal by id Add param to deep GET /proposals/{proposalId} Jan 9, 2023
@bickelj
Copy link
Contributor Author

bickelj commented Jan 9, 2023

@bickelj Is it too late to change the base branch for this PR from main to 101-add-proposal-by-id-endpoint? When laddering PRs like this, that ensures this one only shows the commits added that aren't in that other commit. (GitHub's also smart enough to change the base on this PR to main once that other PR is merged in.)

Ah, I probably should have rebased #192 before starting here, I had rebased this one on main in an effort to make the PR #192 easier to review (e.g. less spammy, less volatile with all the force pushing). I should probably merge a bunch of updates to dependencies first and then get it in order. I'll do that.

};
};

const getProposalWithFieldsAndValues = (
Copy link
Contributor

@reefdog reefdog Jan 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a not terribly helpful comment about this method, which is: I spent quite a lot of time reading through it and still couldn't figure out why it was so complicated. It seems like it should be more concise to simply collect all the versions into an array, and yet…

I trust that there is a reason for this verbosity and some of the gymnastics (like the dummy currentProposalVersion that gets replaced as needed). I think I would suggest that you do these things:

  1. Consider if this method of collecting proposal versions can be simplified / made more legible.
  2. If not, and/or if any improvements still aren't self-documenting, add more comments about why we're doing things.
  3. Consider if anything can be abstracted out to named utility methods — even if not to reduce redundancy, just to have a clearer and more readable flow.

I leave it up to you to decide if that's worthwhile, but the long-term concern here is that this method may be difficult to review and maintain.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a helpful comment, thank you! One thing that comes to mind is the optional nature of the deeper objects and the mandatory (non-null) requirements of some of our types. I think the combination of those might explain some of it. But I am also unsatisfied with the way this one turned out and wish for a better way. One utility method is present, namely getValueWithFullFieldFromRow but there is still plenty to be desired here.

@bickelj
Copy link
Contributor Author

bickelj commented Jan 9, 2023

049bbd8 rebases on 101-add-proposal-by-id-endpoint

@bickelj
Copy link
Contributor Author

bickelj commented Jan 11, 2023

FWIW, I purposely aimed for one database call per one HTTP request here.

When optional query parameter includeFieldsAndValues is set to true on
GET /proposals/{proposalId}, include all proposal versions, associated
values, and associated application form fields in the response.

By returning this almost-fully-deep object tree, it is more convenient
for the caller and more efficient in terms of request, query, and
response counts. The assumption is that the purpose of the PDC is to
see and compare which fields are used for what purpose for proposals.

Issue #101 Implement GET /proposals/{proposalId} endpoint
@bickelj
Copy link
Contributor Author

bickelj commented Jan 16, 2023

Now that the related #192 has been merged, 01a0e71 rebases on main and this is one commit rather than two.

@slifty
Copy link
Member

slifty commented Jan 16, 2023

Heya -- copying this here from a post I made in Zulip (please forgive any context translation / conversational norm differences between mediums!):

O(1) database calls is the metric I'd like us to target (rather than literally 1); one call per type of data returned.

I agree we need to avoid any situation where the number of SQL calls is arbitrary, but I think the cost associated with trying to do just a SINGLE call is incredibly high (both in terms of added code / complexity as well as technical debt, but also in that we're now re-implementing a little extra of what a database is already optimized to do in terms of data packaging and parsing).

So this would look like:

one new SQL query per data type (getProposalVersionsByProposalId, getProposalFieldValuesByProposalId).

Then we can use the existing schema validators to parse the results of those queries, do a quick loop on their results to put them in the correct buckets, and be happy!

I'd be happy to do an example implementation to show what I mean.

@bickelj
Copy link
Contributor Author

bickelj commented Jan 16, 2023

The good news is it sounds like the API design is agreeable, namely the idea of a deep GET on /proposals/{proposalId} and the structure of that response looks OK.

The bad news is the implementation leaves some things to be desired, namely the idea of one SQL query per web request with several joins followed by rolling up the data in TypeScript.

@slifty said:

I agree we need to avoid any situation where the number of SQL calls is arbitrary, but I think the cost associated with trying to do just a SINGLE call is incredibly high (both in terms of added code / complexity as well as technical debt, but also in that we're now re-implementing a little extra of what a database is already optimized to do in terms of data packaging and parsing).

Can you elaborate a bit on the high costs as compared to the alternative?

  • Added code
  • Added complexity
  • Added technical debt
  • Re-implementing what a db does

@slifty also said:

one new SQL query per data type (getProposalVersionsByProposalId, getProposalFieldValuesByProposalId).

Then we can use the existing schema validators to parse the results of those queries, do a quick loop on their results to put them in the correct buckets, and be happy!

I hear what you're saying but this wouldn't technically be O(1) because the last query is actually by proposal version id, so it would be O(N) where N is the count of proposal versions in a given proposal. I also see this as implementing a join in code rather than letting the DB do the join. It would be more efficient in terms of data per row, total data sent from db to service, for sure, but less efficient in terms of round trips (query count).

@slifty
Copy link
Member

slifty commented Jan 17, 2023

This may answer the question, and if not maybe my best bet would be to show via partial implementation / pseudocode.

The approach I'm proposing would involve:

  1. No new types
  2. No new validation scripts
  3. Three new .sql files
  4. Two utility methods for "decorating" some of the parent objects with children.

It would truly be O(1), specifically it would involve three separate queries instead of one single query, which I don't think reflects a particularly high burden from a computational POV. getProposalFieldValuesByProposalId would be indeed by ProposalId and the result would be all field values for all proposal versions associated with that proposal id.

I'll take a stab at explaining better (probably will require code examples to really show what I'm talking about) tomorrow!

@bickelj
Copy link
Contributor Author

bickelj commented Jan 17, 2023

I see the approach in my mind's eye. I assume you're doing a join ingetProposalFieldValuesByProposalId which indeed would make it O(1) (edit: with regard to sql query count).

Laying out the constraints as you did helps. My constraints were quite a bit different:

  1. Exactly one SQL query and
  2. Exactly one iteration over the rows resulting from that query.

Because this is not going to be the only place we get deep associations I think getting it right this time is worth it. I agree that the resulting several-dozen line function is awkward and that the additional types add some cost when we consider this as a precedent to be re-used in other situations. I can't see the cost as incredibly high: it is what it is given the constraints. I agree that changing the constraints can yield a different result in different areas. Since you have a very specific implementation in mind, it will probably be easiest to show it, but I am willing to try it if you like.

@bickelj
Copy link
Contributor Author

bickelj commented Jan 17, 2023

Another thought: suppose we don't validate our own database result rows with a type, that would avoid the creation of those types/validations, etc. (Edit: on second thought this would only push the problem back a bit)

Copy link
Member

@slifty slifty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These were comments that apparently didn't get published so hitting send now!

});

it('returns one proposal with deep fields when includeFieldsAndValues=true', async () => {
// Needs canonical fields,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two thoughts:

  1. Can we take out these comments at this point

  2. sigh -- some day we might consider adding some kind of fixtures / tooling to support the population of the database directly... (and we probably should at least be starting to use the tools we have instead of writing SQL directly). This sub-point is for future, not for this PR directly.

type: 'object',
/* eslint-disable @typescript-eslint/no-unsafe-assignment */
properties: applicationFormFieldSchema.properties,
/* eslint-enable @typescript-eslint/no-unsafe-assignment */
Copy link
Member

@slifty slifty Jan 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh got ya; this should probably do the trick:

applicationFormField: {
  ...applicationFormFieldSchema,
  nullable: true
}

@bickelj
Copy link
Contributor Author

bickelj commented Feb 2, 2023

I hereby officially withdraw this PR in favor of #210 which supercedes it and is much simpler.

@bickelj bickelj closed this Feb 2, 2023
bickelj added a commit that referenced this pull request May 31, 2023
Without this change, a non-deterministic sort order can cause an HTTP
500 response with message "...values and fields must be sorted...".

With this change, the sort order is first by position (as before) and
then also by proposal field value (pfv) ID because there can be many
pfv rows for a given application form field (aff) row. This should be
sufficient to guarantee deterministic order such that the application
join will succeed.

Issue #381 Unexpected 500 response: "values and fields must be sorted"

See also #198
and #210.
bickelj added a commit that referenced this pull request Jun 1, 2023
Without this change, a non-deterministic sort order can cause an HTTP
500 response with message "...values and fields must be sorted...".

With this change, the sort order is first by position (as before) and
then also by proposal field value (pfv) ID because there can be many
pfv rows for a given application form field (aff) row. This should be
sufficient to guarantee deterministic order such that the application
join will succeed.

Issue #381 Unexpected 500 response: "values and fields must be sorted"

See also #198
and #210.
@slifty slifty deleted the 101-deep-proposal-by-id branch November 1, 2023 16:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants