Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug in undergraduateDegreeObtainedBystudent #1

Open
hartig opened this issue Oct 18, 2020 · 5 comments
Open

Bug in undergraduateDegreeObtainedBystudent #1

hartig opened this issue Oct 18, 2020 · 5 comments

Comments

@hartig
Copy link
Member

hartig commented Oct 18, 2020

The resolver for undergraduateDegreeObtainedBystudent is incorrect, at least for the cache-variation of the test server.

Look at line 267 of src/cache/resolver/resolver.js. Here, the resolver applies the offset and the limit before applying the filter. That was not the intention. Instead, the filter has to be applied first.

The same problem may exist in the other server variations as well, and maybe also in other resolvers (I have not checked).

/cc @chengsijin0817

@hartig
Copy link
Member Author

hartig commented Oct 18, 2020

Just checked: the resolver for undergraduateDegreeObtainedBystudent in the batch+cache server has the same bug.

@chengsijin0817
Copy link
Member

@hartig I agree. For the resolver implementation, the filter has to be applied first. I can fix this issue.

However, if only considering our existing query template. I think it does not affect the performance of the GraphQL server.
The related query templates are QT7, QT9, QT13, and QT14. For these queries, the filter and limit/offset does not appear in the same template at the same time.

For example, when executing queries of QT13 or QT14, the line line 267 and line 268 will be skipped. Am I understand correctly?

@hartig
Copy link
Member Author

hartig commented Oct 18, 2020

In fact, it is even worse!

The resolver first fetches all the grad students who got their undergrad degree from the given university, no matter whether there is a filter condition or not. After that, offset and limit are applied to the resulting list of all these students. Next, if there is a filter condition, the resolver fetches all the students again, but this time together with data about their advisors. Hence, the first fetch is completely useless and unnecessary work (in fact, the second fetch cannot even benefit from caching the first fetch because these fetches are a bit different--once without advisors and once with advisor data--and, thus, they are in different caches).

But this unnecessary work is not the only problem: After fetching again the whole list of grad students, and filtering them, the resolver does not apply the offset and the limit!!

Besides these issues, there are two identical copies of code that implements the filter. This is such a bad programming style!

Moreover, the implementation is totally inefficient---the filter conditions, as well as the offset+limit, should be pushed into the SQL query instead of fetching all grad students first, and then applying the filter in the GraphQL server.

I assume that all these issues are present in all the server variants :-(

@hartig
Copy link
Member Author

hartig commented Oct 18, 2020

However, if only considering our existing query template. I think it does not affect the performance of the GraphQL server.

It does. See my follow-up comment.

At least, all server variants are affected in the same way (which is different for the bug in #2). In this sense, fixing this issue here is not the highest priority at the moment (in contrast to #2)

For example, when executing queries of QT13 or QT14, the line line 267 and line 268 will be skipped. Am I understand correctly?

These two lines should be moved to before the final return statement. Moreover, the assignment in line 265 should be moved into the main else block (but without let -- see #2). And these changes need to be done for all server variants. Additionally, the main if block can be optimized much more (see my comment above) and the code duplication should be removed in there.

@chengsijin0817
Copy link
Member

These two lines should be moved to before the final return statement. Moreover, the assignment in line 265 should be moved into the main else block (but without let -- see #2). And these changes need to be done for all server variants. Additionally, the main if block can be optimized much more (see my comment above) and the code duplication should be removed in there.

You are right. All these issues are present in all server variants.
I can change the order of offset+limit and filter for all server variants first, and then, I can optimize the code according to your comments.
Currently, I am doing experiments on changing the number of database connections. To make the measurement results be comparable with previous results. I keep this issue at the moment and fix it after finishing that experiment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants