-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
nextRecord value when maxRecords="0" #22
Comments
Further information from mailing list:
|
I also do not see the reason why "csw:SearchResults/@nextRecord = 1" is expected. |
@lorebiga Can you please take a look at the reported behaviour? |
Any further on this? I'm not sure of the lifecycle of an issue in this ecosystem, but it's been 3 months since the last activity. |
@lorebiga What do you think about this? |
I've finally been able to investigate this issue. The test query defines a result set of 11 records, of which none is returned (since maxRecords=0). Hence, the response should specify:
As clarified at p. 159, "the nextRecord attribute will indicate to the client where to begin the next query" (to iterate the result set), so:
I hope this is clear. I'm closing this issue. By the way, I've noticed that the documentation of test tc23.1 includes the condition that numberOfRecordsMatched > 10, but the test implementation actually tests for numberOfRecordsMatched > 1. Both could be amended to align with the other. I've decided to amend the test implementation. I've also noticed a typo in 07-006r1, §10.8.4.7, p. 149: the mention to "subclause 10.8.4.2" should instead refer to "subclause 10.8.4.3". I've reported it to the OGC staff. |
If you think that this is the definite answer to the question, I can patch my code to comply. To me, it still doesn't make complete sense that this operation, which returns information on all the records, would then assume that the next record to harvest is the first one in the catalogue; it would probably be safer to report next_record=0, not to then potentially mislead a client into potentially re-harvesting the whole catalogue from the start. |
I'm not sure I understand. This operation doesn't return information on all the records, the response refers to the result set identified by the query, which may be a subset of the entire catalogue. |
That's a fair point. So to be precise, this is that information on all
records in the query have been returned.
If I've understood correctly, if resultType="hits" (or equivalently
"maxRecords"=0) then "nextRecord" is superfluous as the response of the
query doesn't 'page'. It's function when resultType="results", is to allow
paging of the query. From Table66 of the spec (p159), in the description
of "numberOfRecordsReturned":
*"Subsequent queries may be executed to see more of the result set. The
nextRecord attribute will indicate to the client where to begin the next
query."*
Is there a situation where there would be a further repeated
resultType="hits" query with nextRecord=X to complete the client's task? I
don't think it makes sense, but I'm willing to hear otherwise. Either way,
if the response has given information on the last of the records in the
query, it should not tell the client which record is next to query to
complete its job.
Does this make sense? I think it boils down to: there are no more records
to query, so don't set nextRecord to a natural number.
Rg.
…On 13 March 2018 at 13:09, Lorenzo Bigagli ***@***.***> wrote:
I'm not sure I understand. This operation doesn't return information on
all the records, the response refers to the result set identified by the
query, which may be a subset of the entire catalogue.
The meaning of nextRecord=0 is clearly defined by the specification (p.
159): "a value of 0 means all records have been returned". In this case, no
record has been returned, so 0 would be wrong.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#22 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAU-IIQb0XV1Jtf8xXk0cQmT-HufUkJVks5td8UTgaJpZM4RCk5g>
.
|
The nextRecord parameter is not applicable to requests, but to responses.
This protocol was designed by the authors of the CSW 2.0.2 specification, based on the requirements collected. Afaik, it was retained as is also in CSW 3.0. |
Not to lose information, I'll respond in-line.
The nextRecord parameter is not applicable to requests, but to
responses.*
This is not being debated. I think I know why you said it, so to be clear
I should've said: "Is there a situation where there would be a further
repeated resultType="hits" query with startPostion=nextRecord to complete
the client's task?".
A client may of course repeat a query with a resultType=hits parameter. *
My question was: Is there a situation where there would be a further
repeated resultType="hits" query [...] ***to complete the client's
task***? Does startPosition have an effect on resultType="hits"? What
about "maxRecords"? If so, but I don't think it does in the PyCSW
implementation, in the case where the catalogue size is 100,
startPosition=10 and maxRecords=50, then would nextRecord=61 be returned
with the response? What about startPosition=90 and maxRecords=50?
Every time, it will get no records in the response, and nextRecord=1. *
Yes, it should get an empty SearchResults, with attributes set to reflect
how many 'hits' the query matches. What nextRecord should be is what I'm
contending, of course; I believe nextRecord=1 is incorrect in this context.
In general, it works like this:*
the client sends a request with a query*
the server computes the query result (a list of 1... n records)*
Could be zero records, if the catalogue is empty for one case.
the server sends a response with some of the records (possibly 0),
specifying the index of the next in the list (i.e. the first record which
is not returned) in the nextRecord paramete*r
To get 0 records from a resultType="results", you either set "maxRecords" =
0, which is specifically stated to operate as a resultType="hits", or there
are no records.
However, I don't see "nextRecord" = "the first record which is not
returned" in the spec. I do see the description for numberOfRecordsReturned
stating:
"Number of records actually returned to client. This may not be the entire
result set [...]. Subsequent queries may be executed to see more of the
result set. The nextRecord attribute will indicate
to the client where to begin the next query."
The suggestion is that, in a response, if nextRecord > 0, there are more
records in the result set, from the request made, which are still to be
returned and that another request is required. In the case of a
resultType='hits' this is not the case; the client should be given a clue
that it's finished it's job and it can relax.
This protocol was designed by the authors of the CSW 2.0.2
specification, based on the requirements collected. Afaik, it was retained
as is also in CSW 3.0.*
Are you an author? If not, would it be worth contacting the authors? I
would argue that this should be discussed for any future iteration of the
specification, to make it crystal clear that this is the expected return
value of nextRecord for resultType='hits' (or equivalent), because if
nextRecord=1 for a non-paging response seems incorrect, but the current
spec text doesn't convince me that it's by design.
To me, it's clear that the nextRecord attribute is for use where a response
does not completely fulfil the request given and used to page the records
in the response and prompt further requests for the remainder of a result.
Cheers, Rg.
|
I contributed to CSW 3.0, but I don't remember a specific discussion around this very point. In general, I think that comments on approved specification should be formulated as Change Requests, which are directed to the appropriate Working Group, or may trigger the formation of a new Revision Working Group. In fact, arguing about the rationale of CSW specification is out of scope, in this context. We are only concerned abut the correctness of existing compliance tests. This said, I will try to answer your questions :-) - Is there a situation where there would be a further repeated resultType="hits" query [...] to complete the client's task? - Does startPosition have an effect on resultType="hits"? What about "maxRecords"? If so, but I don't think it does in the PyCSW implementation, in the case where the catalogue size is 100, On the syntax and semantics of nextRecord: it is mandatory, it is a non-negative integer, and the specification is very clear on the meaning of the expression nextRecord=0: "a value of 0 means all records have been returned". So, in the specific case of the tc23.1 test, how should a server behave? |
I think that you're putting too much faith in the wording of the
description of nextRecord in Table 66.
So, the wording is:
"A value of 0 means all records have been returned".
But take, for example, a request with resultType='results' and
startPosition > 1, where the catalogue has fewer records than maxRecords.
The response will have nextRecord = 0 but not "all records have been
returned". The words do not reflect what is happening. I would suggest
the intended meaning of the description of nextRecord is (parenthesised
text my own, obviously):
"A value of 0 means all records [in the result set relating to the request]
have been returned."
Now take the example of a request with resultType='hits', the result set is
defined to be empty - i.e. zero records; so, when all records (i.e. zero
records) have been returned, nextRecord should be set accordingly (i.e.
nextRecord=0).
Thinking about the impact of changing this test, would it be a huge problem
of backwards compatibility? I don't mind doing some of the work on this,
if it helps. You advice is welcomed.
All the best, Roger
…On Tue, 13 Mar 2018, 19:09 Lorenzo Bigagli, ***@***.***> wrote:
I contributed to CSW 3.0, but I don't remember a specific discussion
around this very point. In general, I think that comments on approved
specification should be formulated as Change Requests, which are directed
to the appropriate Working Group, or may trigger the formation of a new
Revision Working Group.
In fact, arguing about the rationale of CSW specification is out of scope,
in this context. We are only concerned abut the correctness of existing
compliance tests. This said, I will try to answer your questions :-)
*- Is there a situation where there would be a further repeated
resultType="hits" query [...] to complete the client's task?*
Perhaps a polling use-case? Anyway, we should not make assumptions about
client's tasks, in this context.
*- Does startPosition have an effect on resultType="hits"? What about
"maxRecords"? If so, but I don't think it does in the PyCSW implementation,
in the case where the catalogue size is 100, startPosition=10 and
maxRecords=50, then would nextRecord=61 be returned with the response? What
about startPosition=90 and maxRecords=50?*
The specification seems vague on this, and I don't think there's any
specific test. However, note that maxRecords is the maximum number of
records that should be returned. A server may return less. The nextRecord
parameter may be used by the client to continue fetching the result set
from the first record it hasn't got in that response.
On the syntax and semantics of nextRecord: it is mandatory, it is a
non-negative integer, and the specification is very clear on the meaning of
the expression nextRecord=0: "a value of 0 means all records have been
returned".
So, in the specific case of the tc23.1 test, how should a server behave?
It must not return any of the 11 records in the result set, since the
client required maxRecords=0.
But it must return a nextRecord with some value >= 0.
That cannot be 0, because it would imply that all the 11 records be in the
response, which instead must be empty.
Could that be 2... 11? The specification does not state it clearly, but it
is logical to imply that some records would be in the response, in that
case, whereas it must be empty.
By setting nextRecord=1, the server indicates to the client that it can
continue fetching the result set from the first record in the result set.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#22 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAU-IKTpRQNHhPM7PA6fQ7P8ruLZ531Qks5teBltgaJpZM4RCk5g>
.
|
It's not a matter of faith: in this context, we have to stick to the wording of the specification and to the existing scripts. Tc23.1 needs no change, as far as nextRecord is concerned. Your examples are out of scope, though meaningful and interesting. I agree with you on the meaning of nextRecord. In my opinion, the current wording is clear enough: all the parameters in a response refer to the result set related to the request. |
we have to stick to the wording of the specification
Which is why I've discussed the wording of the specification, which does
not clearly support what you believe it to support.
Your examples are out of scope
What do you mean by this? They discuss requests and responses using
GetRecords within the CSW 2.0.2 specification. That's exactly in scope,
surely!?
I agree with you on the meaning of nextRecord.
Nice to know ;¬)
In my opinion, the current wording is clear enough: all the parameters in
a response refer to the result set related to the request.
Which agrees with what I said about the wording not being clear, but if you
take the meaning of the wording as it is, then nextRecord=0 is the correct
response for a resultType='hits' request (as there are no more records to
return).
If it makes it easier to swallow, I'll put in a formal Change Request to
clarify the wording in the specs to prevent cross-interpretation of the way
it's been written.
Do you have any suggestions to an improved wording to suggest? I'll also
give it some thought.
Rg.
…On 14 March 2018 at 15:12, Lorenzo Bigagli ***@***.***> wrote:
It's not a matter of faith: in this context, we have to stick to the
wording of the specification and to the existing scripts. Tc23.1 needs no
change, as far as nextRecord is concerned.
Your examples are out of scope, though meaningful and interesting. I agree
with you on the meaning of nextRecord. In my opinion, the current wording
is clear enough: all the parameters in a response refer to the result set
related to the request.
However, the change you propose is acceptable. You may submit a formal
Change Request.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#22 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAU-IILdGzWcJsTjv-IwLvolouBWpYPbks5teTN3gaJpZM4RCk5g>
.
|
Incidentally, would there have been a faster way to get a change to the
test, other than a Change Request against the spec?
…On 14 March 2018 at 15:30, Roger Duthie ***@***.***> wrote:
> we have to stick to the wording of the specification
Which is why I've discussed the wording of the specification, which does
not clearly support what you believe it to support.
> Your examples are out of scope
What do you mean by this? They discuss requests and responses using
GetRecords within the CSW 2.0.2 specification. That's exactly in scope,
surely!?
> I agree with you on the meaning of nextRecord.
Nice to know ;¬)
> In my opinion, the current wording is clear enough: all the parameters
in a response refer to the result set related to the request.
Which agrees with what I said about the wording not being clear, but if
you take the meaning of the wording as it is, then nextRecord=0 is the
correct response for a resultType='hits' request (as there are no more
records to return).
If it makes it easier to swallow, I'll put in a formal Change Request to
clarify the wording in the specs to prevent cross-interpretation of the way
it's been written.
Do you have any suggestions to an improved wording to suggest? I'll also
give it some thought.
Rg.
On 14 March 2018 at 15:12, Lorenzo Bigagli ***@***.***>
wrote:
> It's not a matter of faith: in this context, we have to stick to the
> wording of the specification and to the existing scripts. Tc23.1 needs no
> change, as far as nextRecord is concerned.
>
> Your examples are out of scope, though meaningful and interesting. I
> agree with you on the meaning of nextRecord. In my opinion, the current
> wording is clear enough: all the parameters in a response refer to the
> result set related to the request.
> However, the change you propose is acceptable. You may submit a formal
> Change Request.
>
> —
> You are receiving this because you are subscribed to this thread.
> Reply to this email directly, view it on GitHub
> <#22 (comment)>,
> or mute the thread
> <https://github.com/notifications/unsubscribe-auth/AAU-IILdGzWcJsTjv-IwLvolouBWpYPbks5teTN3gaJpZM4RCk5g>
> .
>
|
Not that I know of. Change Requests are the OGC mechanism to address cases like this. Sorry I can't help you more on this, I'm already overcommitted with maintaining the tests... :-) |
Are you the only person working on this? Do you maintain the whole test suite or just the CSW tests?
Well, apart from fixing the test with the current wording! ;¬) Given that the Change Request is for the benefit of making sure you're happy with what the spec says, can I run this by you. If I were to suggest the simple change to the description on table 66 for nextRecord to read: This covers the case where the result set naturally has no records, plus (I hope you agree) the cases where {resultType='results', maxRecords=0} and {resultType='hits'}. If you think it is still not clear enough, please suggest what would be. I have a feeling that this issue will not be as protracted if well thought out solutions are presented up front. |
I'm the only maintainer of the CSW 2.0.2 ETS (and I maintain only this ETS). As far as I'm concerned (i.e. these tests), I don't think the spec needs any change: I'm already happy with what it says :-) |
Hi Lorenzo,
I'm not sure I understand your position on this. You say you agree with the argument that nextRecord is not correct when resultType='hits' (or equivalent), but won't fix the problem because of your interpretation of the currently worded spec; however, also do not think that the spec should be changed.
It also seems that you are the gatekeeper on correcting the test to reflect the correct behaviour. So, at least for me, this is a little more than frustrating.
Can you please clarify your position? Otherwise, there could be a lot of wasted effort in trying to follow your suggestion without any possible gain.
I would like to point out that I have only come across one person who thinks this test (i.e. the expected behaviour of the response) is correct and that person is yourself. Is there anyone else who maintains this test suite, from who we can get an outside opinion?
If you are trying to save some work for yourself, I am willing to put some time in myself to help. Just let me know what needs to be done. However, changing the wording in the spec won't save work, in the long run, with a lot more people-hours being put into it and the likely result that this test will have to be fixed anyway.
I look forward to hearing from you further.
Roger
…________________________________
From: Lorenzo Bigagli <notifications@github.com>
Sent: 15 March 2018 13:20:52
To: opengeospatial/ets-csw202
Cc: Duthie, Roger J.A.; Manual
Subject: Re: [opengeospatial/ets-csw202] nextRecord value when maxRecords="0" (#22)
I'm the only maintainer of the CSW 2.0.2 ETS (and I maintain only this ETS).
As far as I'm concerned (i.e. these tests), I don't think the spec needs any change: I'm already happy with what it says :-)
Go ahead with the Change Request as you prefer, sorry I can't help you more on this.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub<#22 (comment)>, or mute the thread<https://github.com/notifications/unsubscribe-auth/AAU-IBQeCPFjYwkLaq0TQ35ImO9AQ1vjks5temq0gaJpZM4RCk5g>.
________________________________
This message (and any attachments) is for the recipient only. NERC is subject to the Freedom of Information Act 2000 and the contents of this email and any reply you make may be disclosed by NERC unless it is exempt from release under the Act. Any material supplied to NERC may be stored in an electronic records management system.
________________________________
|
Roger, I think we are repeating ourselves here. I am the maintainer of the CSW 2.0.2 ETS. You can report and refer to me on issues related to the CSW 2.0.2 ETS implementation. I am not the right recipient for arguing the rationale and the design of the CSW specification in general, nor to improve the CSW specification. Please, don't refer to me on Change Requests to the CSW specification. I think test tc23.1 is correctly implemented. I quote from above: the test query defines a result set of 11 records, of which none is returned (the request specifies maxRecords=0). Hence, the response should specify: numberOfRecordsMatched="11" As clarified at p. 159, "the nextRecord attribute will indicate to the client where to begin the next query" (to iterate the result set), hence: nextRecord="1" The CSW specification has been perused many times, this ETS has been executed hundreds, maybe thousand times, and no one has raised this issue before. I respect your opinion, but I disagree with it: I think that test tc23.1 is correctly implemented wrt nextRecord value, and there is nothing to fix. I hope you can accept my decision and agree on closing this discussion, for the time being. |
Thank you for you patience while I try to make my point clear. If I submit a Change Request, I will definitely refer to this discussion in my description, as this is the whole reason for the Change Request. I've already written a draft for the request, in fact. It doesn't have your name in it, but refers to 'test maintainers', etc. That this issue hasn't been raised does not mean the behaviour is correct. The case in hand would have little exposure, I would expect, because in the use cases for GetRecords requests with resultType='hits' or equivalent, you would normally ignore the nextRecord attribute (... because you have all the records you expect to have!!! ;¬) ). So, the only people who are going to be concerned about these things are the implementation devs, who are not numerous. I expect a lot of little issues are ignored because of barriers to resolution, reading the spec, understanding it, etc. It's probably easier to change your code to match the test suite. You should definitely not do things the way they have been done before, if they have been done incorrectly. Yes, at danger of repeating ourselves:
To me, the above statement is contradictory to your point: the response doesn't need to iterate the result set, it has all the records it asked for (i.e 0). It's maybe too late to convince you, but it's worth noting for the discussion that the [currently worded] spec also states in the "maxRecords attribute" section: You don't seem be compromising on this, so the next step would be to create the Change Request. The purpose of the Change Request would be to simply to clarify that in the case where no records were requested (or even that no records are present in the query result) and no records are returned, the nextRecord value doesn't tell the client to start the next query at record number 1. Instead, it should be crystal clear that the nextRecord attribute should be set to 0, to signify that the records included in the response (i.e. 0 records) fulfils the entirety of the request. I am still certain that this is what the current wording of the spec actually says. Maybe @dstenger has an outside opinion to offer? |
This is to comply with the current ETS-CSW test suite - cf., opengeospatial/ets-csw202#22
* Code now considers startposition when checking returned records Previously, the code would check if maxrecords was greater than the matched records, but did not consider that the startposition might be non-zero. This is now addressed for different cases of startposition, maxrecords and matched. Code change in csw2.py only, to be tested. The code in csw3.py also needs modified in a similar way. * Functional test post_GetRecord-end expected XML corrected The numberOfRecordsReturned in the expected XML for the test default/post/GetRecords-end.xml was inconsistent with the number of records in the expected XML. This test failed when the code generating the numberOfRecordsReturned attribute was updated. Both the code and the expected XML for this test now agree and reflect the correct value. * csw3 code uses startposition in calculation of numberOfRecordsReturned * numberOfRecordsReturned and nextRecord attributes good for maxRecords=0 The case where maxRecords=0 means that the nextRecord parameter should be set to zero, to indicate that information on all records has been returned. * An amend to last commit, without the 'amend': fix to introduced problem - a * CITE test with maxRecords=0 now returns nextRecord=0 Although resultType=results, with maxRecords=0, the response should have nextRecord=0, signifying that information on all the records have been returned. The expected XML for this CITE test has been modified to check against a response with nextRecord=0 (instead of 1). * If 'maxrecords' = 0, but matched records > 0, 'nextRecord' = 1 This is to comply with the current ETS-CSW test suite - cf., opengeospatial/ets-csw202#22 * Rolled back change to pycsw functionaltests to match ets * CSW3 functionaltests dictate that nextrecord='0' for maxrecords='0' Although the CSW2 behaviour is for nextrecord='1', because of the policy of the ets-csw202 test suite, the CSW3 functionaltests in the pycsw repo prescribe that nextrecord='0'. This has been so for over a year, so I'm not going to change the test. The CSW2 behaviour might still be brought into line with CSW2, as there is no reason for it not to (apart from the assertion of the ets-csw202 test maintainer that the standard disallows it).
* Code now considers startposition when checking returned records Previously, the code would check if maxrecords was greater than the matched records, but did not consider that the startposition might be non-zero. This is now addressed for different cases of startposition, maxrecords and matched. Code change in csw2.py only, to be tested. The code in csw3.py also needs modified in a similar way. * Functional test post_GetRecord-end expected XML corrected The numberOfRecordsReturned in the expected XML for the test default/post/GetRecords-end.xml was inconsistent with the number of records in the expected XML. This test failed when the code generating the numberOfRecordsReturned attribute was updated. Both the code and the expected XML for this test now agree and reflect the correct value. * csw3 code uses startposition in calculation of numberOfRecordsReturned * numberOfRecordsReturned and nextRecord attributes good for maxRecords=0 The case where maxRecords=0 means that the nextRecord parameter should be set to zero, to indicate that information on all records has been returned. * An amend to last commit, without the 'amend': fix to introduced problem - a * CITE test with maxRecords=0 now returns nextRecord=0 Although resultType=results, with maxRecords=0, the response should have nextRecord=0, signifying that information on all the records have been returned. The expected XML for this CITE test has been modified to check against a response with nextRecord=0 (instead of 1). * If 'maxrecords' = 0, but matched records > 0, 'nextRecord' = 1 This is to comply with the current ETS-CSW test suite - cf., opengeospatial/ets-csw202#22 * Rolled back change to pycsw functionaltests to match ets * CSW3 functionaltests dictate that nextrecord='0' for maxrecords='0' Although the CSW2 behaviour is for nextrecord='1', because of the policy of the ets-csw202 test suite, the CSW3 functionaltests in the pycsw repo prescribe that nextrecord='0'. This has been so for over a year, so I'm not going to change the test. The CSW2 behaviour might still be brought into line with CSW2, as there is no reason for it not to (apart from the assertion of the ets-csw202 test maintainer that the standard disallows it).
I have now filled out a change request for clarification on this issue in the CSW spec: http://ogc.standardstracker.org/show_request.cgi?id=547 |
As suggested in the OGC CITE thread, opening this issue to address
maxRecords=0
behaviour. Associated pycsw ticket at geopython/pycsw#557 (comment) (CITE trace at geopython/pycsw#557 (comment))The text was updated successfully, but these errors were encountered: