Make location handling less obtrusive #2
Comments
This looks a little bit brittle to use. Not sure you are going to like my answer.

What I would personally do is not provide a stream abstraction/interface at the lowest level of your library. Streams of signals should be built on top. One of the reasons is that the streaming landscape in OCaml is a little bit unsatisfying and fragmented at the moment, and the possible arrival of effects in the future may change things again (e.g. among others there is sequence, gen, Core's sequence, and we have been discussing privately an alternative with @pqwy). I don't think that it's the task of your library to solve this problem.

The value I personally see in your library is non-blocking, error-correcting, incremental parsing of the insane HTML5 specification. I don't really care about its streaming abstraction and combinators; the latter should be left to a generic streaming library.

The separate function you propose looks like a closure acting on a hidden data structure. Why not expose this data structure so that more operations can be performed on it? This data structure, holding your parsing state, could simply be called a

Once you have that lowest-level abstraction, you can provide various kinds of lighter, quick and dirty APIs to the end user (e.g. only streams with no positions) without caring too much, since you know that if you don't provide the right thing the user can easily dig one level down to access the data and form the stream they need with the libraries they use. You should make a base API that allows maximal reuse of the value of your library, which is: standards-compliant, non-blocking and incremental HTML5/XML parsers.
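To make the suggestion above concrete, here is a minimal sketch of a parsing state exposed as an abstract type, with a stream layered on top by the caller. Everything in it is hypothetical: the module name `Low_level`, the toy `signal` type, and the functions `create`, `next` and `position` are invented for illustration and are not Markup.ml's API. The "parser" just classifies pre-split tokens.

```ocaml
(* Hypothetical sketch: parsing state as a first-class value rather than
   a closure. None of these names come from Markup.ml. *)
module Low_level : sig
  type t
  type signal = Start of string | Text of string | End

  val create : string list -> t
  val next : t -> signal option
  (* Number of tokens consumed so far; stands in for a real location. *)
  val position : t -> int
end = struct
  type signal = Start of string | Text of string | End
  type t = { mutable tokens : string list; mutable consumed : int }

  let create tokens = { tokens; consumed = 0 }

  (* Toy classification: "</x>" closes, "<x>" opens, anything else is text. *)
  let classify tok =
    let n = String.length tok in
    if n >= 3 && tok.[0] = '<' && tok.[1] = '/' && tok.[n - 1] = '>' then End
    else if n >= 2 && tok.[0] = '<' && tok.[n - 1] = '>' then
      Start (String.sub tok 1 (n - 2))
    else Text tok

  let next st =
    match st.tokens with
    | [] -> None
    | tok :: rest ->
      st.tokens <- rest;
      st.consumed <- st.consumed + 1;
      Some (classify tok)

  let position st = st.consumed
end

(* A stream built on top, outside the library, using the standard Seq.
   The resulting sequence is ephemeral: it should be traversed once. *)
let to_seq (st : Low_level.t) : Low_level.signal Seq.t =
  let rec go () =
    match Low_level.next st with
    | None -> Seq.Nil
    | Some s -> Seq.Cons (s, go)
  in
  go

let st = Low_level.create [ "<p>"; "hi"; "</p>" ]
let signals = List.of_seq (to_seq st)
```

Because the state value `st` stays in the caller's hands, operations such as `position` remain available whichever streaming library is used on top.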
Well, there are several matters here.

CPS streams are the way in which the library is internally composed, i.e. this abstraction is not merely introduced by the interface. All the interface does is hide their CPS nature. Of course, there are closures with mutable state involved, and I have considered exposing some of them as state types. However, I haven't yet had a strong reason to do so, and it would complicate the interface – so far, in my opinion, unnecessarily.

For this issue, I was thinking of something like a parameter

Whether to provide an API like

I don't believe streams (or I/O) belong in Markup.ml. My plan is to eventually factor these out into separate libraries. I'm also likely to expose the CPS interface at some point. Currently the closest way to it is the monadic I/O adapter functor

The main reason I did not use an existing sequence library is poor support for programming that is both agnostic to the question of synchronous vs. asynchronous usage and supports a straightforward concept of composition. I may be wrong about this, of course, but that is the impression I got.

I am very interested in this, and would be glad to join whatever discussions on sequence types that are ongoing. Besides Markup.ml, Lambda Soup and a few other projects I am considering also need good sequence types.
I don't want to restart the iterator debate here, but we really need a place to do so. I fail to see why we need one or two additional types, and how they would improve on what currently exists (as cited above, Core.Sequence, gen, sequence, various lazy lists), even assuming effects are coming into OCaml (they would improve gen too, for instance).
Can someone provide a link to effects? I vote in favor of debating sequences somewhere. This really belongs in a well-known (but tightly focused) library or even the standard library.
A bit dated, but http://kcsrk.info/ocaml/multicore/2015/05/20/effects-multicore/ was the original post, I believe.
On Thursday, 14 January 2016 at 20:06, Anton wrote:
On Thursday, 14 January 2016 at 20:15, Simon Cruanes wrote:
Well, if you fail to see, you didn't follow the discussion then. So to restate it: in my opinion, none of these solutions have a compelling story for handling errors occurring at either the producer or consumer end, something that is important for my use cases. Best, Daniel
Daniel, thanks for your comments. I took a "middle" approach for the time being. There is value in having a parser/decoder type for querying in the future. I don't think I will move to a full jsonm-like interface (i.e. `` `Await ``) for now, mainly because it would require non-trivial internal adaptation, and I don't yet see a benefit to usage. I will be able to make a more informed decision when I am working on performance. In the meantime, the current interface and a jsonm-like interface are both implementable in terms of each other, so this should not be a big problem. I can always make the basic interface of today be the high-level wrapper for tomorrow, or even just have a big breaking change before 1.0.
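As a rough illustration of one direction of that equivalence, the sketch below wraps an `` `Await ``-style decoder in a pull function. It is only loosely modeled on the jsonm convention and is not Jsonm's or Markup.ml's API: the `decoder` record, `feed`, `close`, `decode` and `pull_of_decoder` are all hypothetical, and the wrapper is synchronous (it calls `refill` whenever the decoder awaits input). The opposite direction is not shown.

```ocaml
(* Hypothetical decoder that hands back queued items and signals `Await
   when it has run out of input but the input has not been closed. *)
type 'a decoder = { queue : 'a Queue.t; mutable closed : bool }

let create () = { queue = Queue.create (); closed = false }

(* Supply more input to the decoder. *)
let feed d items = List.iter (fun x -> Queue.add x d.queue) items

(* Declare that no more input will arrive. *)
let close d = d.closed <- true

let decode d =
  match Queue.take_opt d.queue with
  | Some x -> `Signal x
  | None -> if d.closed then `End else `Await

(* Pull-style wrapper: on `Await, ask [refill] for another chunk;
   [refill] returns None once the source is exhausted. *)
let pull_of_decoder d (refill : unit -> 'a list option) =
  let rec next () =
    match decode d with
    | `Signal x -> Some x
    | `End -> None
    | `Await ->
      (match refill () with
       | Some items -> feed d items
       | None -> close d);
      next ()
  in
  next

(* Usage: a source delivering three chunks, one of them empty. *)
let chunks = ref [ [ "a"; "b" ]; []; [ "c" ] ]

let refill () =
  match !chunks with
  | [] -> None
  | c :: rest -> chunks := rest; Some c

let d = create ()
let next = pull_of_decoder d refill
let collected = [ next (); next (); next (); next () ]
```

The caller of `next` never sees `` `Await ``; that case is absorbed by the wrapper, which is the sense in which a simple pull interface can sit on top of a lower-level one.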
The parsers currently emit streams of `location * signal`, which causes a problem – for each processing function, should it take `location * signal` streams or `signal` streams? This forces the user to think about when to call `drop_locations`.

I will probably change the parsers to return `signal` streams and a separate function `unit -> location` which will evaluate to the last location emitted. The user will have to be warned that peeking the `signal` stream will result in inaccurate location corresponding to the next call to `next`.

@dbuenzli would you happen to have any recommendations on this?
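The peeking caveat can be seen in a small sketch. The types and functions here (`make_parser`, `with_peek`, the toy `signal` type, locations as line/column pairs) are assumptions made for illustration and do not reflect Markup.ml's actual definitions.

```ocaml
(* Hypothetical types; not Markup.ml's actual definitions. *)
type location = int * int (* line, column *)
type signal = Start of string | Text of string | End

(* A toy "parser" over a fixed list of located signals. It returns a pull
   function for signals and a separate function that evaluates to the
   location of the last signal emitted. *)
let make_parser (input : (location * signal) list)
  : (unit -> signal option) * (unit -> location) =
  let remaining = ref input in
  let last = ref (1, 1) in
  let next () =
    match !remaining with
    | [] -> None
    | (loc, s) :: rest ->
      remaining := rest;
      last := loc;
      Some s
  in
  (next, fun () -> !last)

(* A naive one-element peek buffer layered on top of [next]. *)
let with_peek next =
  let buffer = ref None in
  let peek () =
    match !buffer with
    | Some _ as s -> s
    | None ->
      let s = next () in
      buffer := s;
      s
  in
  let next' () =
    match !buffer with
    | Some _ as s -> buffer := None; s
    | None -> next ()
  in
  (peek, next')

let input = [ (1, 1), Start "p"; (1, 4), Text "hi"; (1, 6), End ]
let next, location = make_parser input
let peek, next = with_peek next

let first = next ()               (* consumes Start "p" *)
let loc_first = location ()       (* location of Start "p" *)
let peeked = peek ()              (* pulls Text "hi" out of the parser *)
let loc_after_peek = location ()  (* already the location of Text "hi" *)
```

After `peek`, the consumer has still only consumed `Start "p"`, yet `location ()` reports the position of the peeked signal. This is the inaccuracy the warning would need to cover.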