FETCH all data? #9
There is a FETCH with no arguments allowed. I am guessing that is what this does, but should be clarified in the docs. You would presumably still have to do a STATION and a SELECT, but those could be wildcarded I think. |
This is for "DATA [seq]" but "FETCH [seq]" has the same meaning for seq. The "-1" is a magic word (number) to mean the next packet, but there isn't a magic number/word to indicate the first available packet. |
This seems like a problem, since FETCH is supposed to not wait for any additional new data. So a FETCH -1 would finish immediately with no data returned by that definition I think? Perhaps for FETCH it should be:
@andres-h can you comment on the expected behavior for |
Looks like this was lost in the description, but The purpose of FETCH is cyclic transmission (originally for dial-up links). If I'm not sure if fetching all data blindly would be useful. The legacy SeedLink does have (probably undocumented) |
Even if A bigger question: what is the use case for
Slightly off topic. I either never knew this or forgot, so no support in libslink. In my own implementations I've used "uni-station" mode as an all-stations mode (i.e. In the current draft, this all-stations mode would be a shortcut for a selecting all stations with wildcards. Currently, submitting |
I was assuming the use of FETCH was a quick connect: get all data that is ready and available, then disconnect, as for use over a high-cost transmission line where connection time should be minimized. So how would a client that wants to recover the currently queued data, but not wait for additional packets, use DATA? How would it know when the stream was finished, and so know it was time to disconnect? Is this the main (only?) use case for FETCH? And, just a question: is this a use case seedlink4 must support? While "get all" might be reasonable if the remote system has limited storage, it could be a bit dangerous when connecting to a data center with a large amount of data. Could a data center choose not to support FETCH with no arguments? |
My premise: the point and criteria at which a server decides the client has it "all" is somewhat arbitrary and often is no longer true moments later. In general the server does not have special knowledge of the data flow, only what is in its buffers, which is not all that special and may change a moment later. A client can detect when the data streams being received are a) within some tolerance of "now" and b) when data has stopped flowing for X seconds, and send
A data center can choose to limit client access based on all sorts of criteria including "too much" data; that must be an option for a server to protect itself. Note that the entire feed from the EarthScope export server is currently ~1.8 MB/second of 28,000+ streams. Of course it will grow as data rates increase, but right now that's less than a 4K video stream. |
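The client-side check described above (streams within some tolerance of "now", and no data for X seconds) could be sketched roughly as follows. This is only an illustration: the class name, the default thresholds, and the use of epoch seconds are assumptions, it tracks a single stream, and what the client does once the check succeeds is left to the caller.

```python
from typing import Optional


class CatchUpDetector:
    """Hypothetical client-side heuristic for deciding a stream has caught up."""

    def __init__(self, tolerance_s: float = 60.0, idle_s: float = 5.0) -> None:
        self.tolerance_s = tolerance_s  # (a) how close to "now" the data must be
        self.idle_s = idle_s            # (b) the "X seconds" without new packets
        self.last_arrival: Optional[float] = None
        self.latest_end_time: Optional[float] = None

    def on_packet(self, packet_end_time: float, now: float) -> None:
        """Record a received packet; both times are epoch seconds."""
        self.last_arrival = now
        if self.latest_end_time is None or packet_end_time > self.latest_end_time:
            self.latest_end_time = packet_end_time

    def caught_up(self, now: float) -> bool:
        """True when the newest data is near 'now' and the stream has gone idle."""
        if self.last_arrival is None or self.latest_end_time is None:
            return False  # nothing received yet, so nothing to judge
        near_now = (now - self.latest_end_time) <= self.tolerance_s
        idle = (now - self.last_arrival) >= self.idle_s
        return near_now and idle
```

A real client would call `on_packet` from its receive loop with `time.time()` as `now`, and would need one detector per stream (or a combined rule) to cover the low-sample-rate and silent-station cases raised in this thread.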
Not sure if saving a few bytes is worth making the logic more complex. You aren't typing those commands by hand.
Grabbing what is available rather than waiting for complete data. Note that FETCH can also be used with time windows. It could be used by an early warning application or a cron job that updates plots every 30 minutes. You don't want to get stuck waiting for data if a station is not sending.
Yes, that is kind of "emulating" FETCH using DATA. I don't think it is a very clean solution. |
I can see that |
I am not sure I see the use case for Perhaps making the commands more explicit would avoid confusion. I don't think we really need to save a couple of bytes when we can design the protocol to avoid magic numbers. It seems like the -1 in DATA with time is just a placeholder. Perhaps all cases can be covered by:
Maybe servers that do not want to allow for the ALL variants can respond with ERROR? Does it make sense to allow |
I have vague memory of decision that uni-station mode was not going to be part of seedlink4. I agree that saving bytes isn't worth it here as |
To start cyclic transmission from current data, but not wait for next packet.
No. In order to not break that, I suggested a change that
It is left open forever, because it is difficult to detect when endtime is reached. Eg., should it wait for very low sample rate streams?
...and if there are no packets in the queue and the station has been destroyed and will never send more data, but neither the server nor the client knows that? Does it wait forever, or send END without sending a packet? It seems that if a client is starting from scratch, sending an INFO to find out the state of the queue makes more sense than trying to FETCH a single packet? |
If the station was working before and it is configured in the server, there may be some data in the buffer. If the buffer is totally empty, it would send END without sending a packet. I don't know how far the proposal is, but if it was already reviewed/accepted or something like that, we should probably not make fundamental changes. |
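As a rough sketch of the behavior described in this comment (send whatever is buffered, then END, and send END alone if the buffer is empty): the packet representation, the function name, and the use of the string "END" as a marker are all assumptions made for illustration, not anything taken from the draft specification.

```python
from typing import Iterable, Iterator, Tuple, Union

# Hypothetical packet representation: (sequence number, payload).
Packet = Tuple[int, bytes]


def fetch_response(buffer: Iterable[Packet], start_seq: int) -> Iterator[Union[Packet, str]]:
    """Yield every buffered packet with sequence >= start_seq, then "END".

    The server never waits for new data here: an empty buffer, or one with
    no matching packets, produces "END" and nothing else.
    """
    for seq, payload in buffer:
        if seq >= start_seq:
            yield (seq, payload)
    yield "END"
```

With an empty buffer, `list(fetch_response([], 0))` contains only the END marker, so the client is never left waiting on a station that will not send again.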
The technical review stage has just begun, this is the time to discuss and consider final changes before a recommendation to approve or not is generated. |
No problem and understandable. The logic is not more complex in my own implementations so it will remain as an extension due to its high convenience and consistent behavior. |
This description of behavior for an empty buffer sounds like what I suggested in #9 (comment). That seems the most sensible to me.
My vote is remove If the vote is to retain I guess am ok with |
I don't see why a client should use -1 repeatedly. If it uses seq 55 repeatedly it also gets the same packet multiple times. I'll try to stay out of this discussion as it leads to nowhere like many times before. If the proposal gets approved, I can implement everything that I need as a workaround or extension anyway. |
This discussion has fleshed out a lot of what the two commands are expected to do. However, I'm still not sure what the difference is between a DATA request with a given end-time and the same request using FETCH. Maybe it's just a matter of expanding on this in the documentation. I'm pretty sure the use case I was thinking of could be handled with a FETCH with a very wide time window, or doing an INFO request to extract the sequence numbers prior to a request. |
Thinking on this, I now read it as future transmission of data (DATA), rather than existing data (FETCH). |
I agree, it's not clean. But perhaps it's sufficient for the use cases that FETCH supports. Do we know how often FETCH is used? My impression is that its use is very rare, if it is used at all, but that may be wrong. |
I believe with FETCH the server will close the connection when it has sent all the data it can to fulfill the request, whereas with DATA it will wait for the time window to be completed. How a given server determines that the time window is completed is a gray area. |
Important question: if FETCH is not commonly used, or if there is not a compelling use case for it, then only having DATA makes sense to me. The only use case I can come up with where FETCH is needed is pulling packets from a station over a satellite link, but I have no idea if anyone actually uses SeedLink for this. I guess the answer is no. |
Feedback from proposal team

Define special sequence number -2 (start of buffer) in addition to previously defined -1 (end of buffer). “FETCH -1” MAY return data if next packets arrive within a certain small time period. “FETCH -2” returns all data that is available in the server for requested station(s).¹

Discussion

¹ What should be the role of -2 when time window is used? |
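One way a server might map the two special values in this feedback onto a concrete starting point is sketched below. The function and constant names are hypothetical, and the sketch assumes a non-empty buffer with increasing sequence numbers and no wraparound; it does not address the open question about time windows.

```python
SEQ_NEXT = -1   # "end of buffer": start with the next packet to arrive
SEQ_FIRST = -2  # "start of buffer": start with the oldest packet still held


def resolve_start(seq: int, oldest: int, newest: int) -> int:
    """Map a requested sequence number onto a concrete starting sequence.

    `oldest` and `newest` are the sequence numbers at the two ends of the
    server buffer (assumed non-empty for this sketch).
    """
    if seq == SEQ_FIRST:
        return oldest
    if seq == SEQ_NEXT:
        return newest + 1
    if seq < 0:
        # Any other negative value has no defined meaning, so reject it.
        raise ValueError(f"undefined special sequence number: {seq}")
    return seq
```

Rejecting other negative values is a design choice made here for illustration; the feedback above only defines -1 and -2.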
I still feel that using negative numbers to be special cases is confusing, a source of bugs and has no advantages I can see. If that all/latest functionality is needed, just make a separate command for each, like Consider a client that has packet number 8 and decides to fetch the 10 previous packets:
and ends up getting all the data that the server has. |
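The arithmetic behind this concern can be sketched as follows: a client holding packet 8 that naively subtracts 10 lands exactly on -2, the value proposed for "start of buffer". The helper names are hypothetical, and the clamped variant assumes sequence numbers start at 0.

```python
def naive_previous(current_seq: int, count: int) -> int:
    """Compute a starting sequence by plain subtraction (the buggy client)."""
    return current_seq - count


def checked_previous(current_seq: int, count: int) -> int:
    """Clamp at zero so the result can never collide with a special value."""
    return max(0, current_seq - count)
```

Here `naive_previous(8, 10)` yields -2, which a server honoring the special value would read as a request for its entire buffer, whereas `checked_previous(8, 10)` yields 0. Separate commands for the all/latest cases would remove the need for the client-side guard altogether.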
From the specification and discussion in this issue thread, I've summarized the use cases I see and approaches that can be taken to satisfy them:
The possible uses of FETCH to satisfy the use case scenarios above are all done without the use of magic numbers, since it is ideal to avoid these in protocol specifications. Further to that end, if there are additional use cases involving the DATA command that actually require start/end time parameters, then it would be ideal to consider a small syntax adjustment to the specification to eliminate the potential ambiguity between the sequence number and start time parameters of the DATA command.

@chad-earthscope, from his experience working in a data management center environment providing a SeedLink interface, has indicated earlier in this issue thread his sense that the use of FETCH is rare. From my experience working for a company whose products include dataloggers, when SeedLink is used for data acquisition over a network, real-time mode is used exclusively on those products; on-demand data retrieval requests are instead satisfied without SeedLink, e.g. using the FDSN dataselect web service. The cyclic data transmission use case above is not as cleanly satisfied with FETCH, but it may be a moot point.

Overall, it would be good to know whether others are aware of SeedLink FETCH being actively used in practice in preference to other historical data retrieval methods, to determine whether it's worth keeping FETCH or instead dropping it to simplify server implementations. |
Issue #17, if resolved as proposed, may affect this, possibly rendering it moot. |
Would it be useful to have a mechanism to say FETCH all the data from a server in one hit?
(e.g. without having to use an INFO streams, do the parsing and then make a request).