Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[transducers] Transducers crash when source is an empty string #186

Closed
earthlyreason opened this issue Dec 12, 2019 · 14 comments · Fixed by #223
Closed

[transducers] Transducers crash when source is an empty string #186

earthlyreason opened this issue Dec 12, 2019 · 14 comments · Fixed by #223

Comments

@earthlyreason
Copy link
Contributor

Most transducers take a source sequence as a final argument. Since strings are iterable, so you can do this:

[...tx.filter(x => true, "abc")]
"abc"

However, if you provide an empty string, they will crash:

[...tx.filter(x => true, "")]
TypeError: tx.filter(...) is not iterable

This is obviously because the transducer functions do a truthiness test on their final argument to determine whether a source was provided:

export function map<A, B>(fn: Fn<A, B>): Transducer<A, B>;
export function map<A, B>(fn: Fn<A, B>, src: Iterable<A>): IterableIterator<B>;
export function map<A, B>(fn: Fn<A, B>, src?: Iterable<A>): any {
    return src
        ? iterator1(map(fn), src)
        : (rfn: Reducer<any, B>) => {
              const r = rfn[2];
              return compR(rfn, (acc, x: A) => r(acc, fn(x)));
          };
}

This could be corrected by replacing src with isIterable(src) (using @thi.ng/checks). This would incur the additional work of looking up Symbol.iterator, but it would seem to be more correct. I can't think of a case where this would be a breaking change (i.e. values for src that work now but are not iterable by that test).

@postspectacular I'm willing to PR if you think this is actionable. I understand there's a lot of branch work going on, so whatever you think best.

A workaround for the time being is to wrap the string argument in [...s].

@earthlyreason
Copy link
Contributor Author

I can't think of a case where this would be a breaking change (i.e. values for src that work now but are not iterable by that test).

Technically that assumes ES2015. In ES5 an array is truthy but doesn't have a function at Symbol.iterator.

Indeed, a truly un-shimmed ES5 environment would crash just by calling isIterable, since it assumes the existence of Symbol itself.

I don't know whether people are using @thi.ng in such environments.

@postspectacular
Copy link
Member

Hi @gavinpc-mindgrub - thank you for this! Strings (incl. empty ones) are indeed supposed to be fully supported as iterables and so the more proper check seems unavoidable. I'm not worried about ES5 compatibility since this is explicitly stated in the main readme that all packages are distributed as ES6 w/o any polyfills...

Btw. The alternatives I was originally considering to completely avoid the function overloads and checks for a final iterable arg was to provide additional functions like filterIter or rewrite the @thi.ng/iterators to just wrap/delegate to the transducer versions. I think that ship has sailed by now though, but keen to hear your thoughts...

@postspectacular
Copy link
Member

@gavinpc-mindgrub do you have bandwidth for a PR? No worries if not, just let me know pls...

@earthlyreason
Copy link
Contributor Author

@postspectacular, sure I'll send something this evening (I'm EST). Should this branch off of master?

On the design question, I have found the overloads convenient for sketching and "normal" usage. I am mostly working on metaprogramming, though, and in that context I have made my own wrappers anyway (to support lazy, serializable bindings representing a transducer description).

@postspectacular
Copy link
Member

@gavinpc-mindgrub great & thank you. There's no rush & please branch off develop...

I'd also say to stick with the overloads to avoid breakage of existing code and yet another major version change.

(Btw. Looking v.forward to see what you've been up to with this all... :) )

@earthlyreason
Copy link
Contributor Author

@postspectacular, roger that. I currently get the following test failures when running off of develop. Before I dig into these, are they currently expected?

The last one may be a legit timeout. I am working right now on a PC that is stupid slow.

  1) PubSub
       unsubTopic:

      Uncaught AssertionError [ERR_ASSERTION]: Expected values to be loosely deep-equal:

{
  a: [
    'a'
  ],
  b: []
}

should loosely deep-equal

{
  a: [
    'a'
  ],
  b: [
    'b'
  ]
}
      + expected - actual

       {
         "a": [
           "a"
         ]
      -  "b": []
      +  "b": [
      +    "b"
      +  ]
       }

      at Timeout._onTimeout (test\pubsub.ts:103:20)
      at listOnTimeout (internal/timers.js:531:17)
      at processTimers (internal/timers.js:475:7)

  2) StreamSync
       never closes:

      Uncaught AssertionError [ERR_ASSERTION]: Expected values to be loosely deep-equal:

[
  {
    a: 1,
    b: 1
  },
  {
    a: 2,
    b: 2
  }
]

should loosely deep-equal

[
  {
    a: 1,
    b: 1
  },
  {
    a: 2,
    b: 2
  },
  {
    a: 3,
    b: 3
  }
]
      + expected - actual

         {
           "a": 2
           "b": 2
         }
      +  {
      +    "a": 3
      +    "b": 3
      +  }
       ]

      at Timeout._onTimeout (test\stream-sync.ts:200:20)
      at listOnTimeout (internal/timers.js:531:17)
      at processTimers (internal/timers.js:475:7)

  3) Subscription
       unsub does not trigger Subscription.done():
     Error: Timeout of 100ms exceeded. For async tests and hooks, ensure "done()" is called; if returning a Promise, ensure it resolves. (D:\home\gavin\lib\my-umbrella\packages\rstream\test\subscription.ts)
      at Timeout._onTimeout (test\subscription.ts:54:13)
      at listOnTimeout (internal/timers.js:531:17)
      at processTimers (internal/timers.js:475:7)

@postspectacular
Copy link
Member

I've been getting them sporadically on travis (never locally) and assuming these are just issues with the timing related test setup and for whatever reason slow or super imprecise native timers in node. am 99.999% sure you can ignore these and/or rerun... also see travis build history

@earthlyreason
Copy link
Contributor Author

@postspectacular I have this more or less done.

Question about flatten, though. flatten already checks for iterable, but it has a special case for string.

export function flatten<T>(src?: Iterable<T | Iterable<T>>): any {
    return flattenWith(
        (x: any) =>
            x != null && x[Symbol.iterator] && typeof x !== "string"
                ? <any>x
                : undefined,
        src!
    );
}

So right now, the following tests pass:

        assert.deepEqual([...flatten(["", "a"])], ["", "a"]);
        assert.deepEqual([...flatten([[], ["a"], ""])], ["a", ""]);
         // (I added these)
        assert.deepEqual([...flatten([["abc"]])], ["abc"]);
        assert.deepEqual([...flatten(["abc"])], ["abc"]);
        assert.deepEqual([...flatten("abc")], ["a", "b", "c"]);
        assert.deepEqual([...flatten([""])], [""]);

flatten("") still crashes, but I'm not sure to handle that without changing existing behavior. The current behavior is somewhat inconsistent in that strings are treated as iterables at the top level but not otherwise:

> [...tx.flatten("abc")]
[ 'a', 'b', 'c' ]
> [...tx.flatten(123)]
Thrown:
TypeError: xs is not iterable
    at iterator (D:\home\gavin\research\the-better-thing\node_modules\@thi.ng\transducers\lib\index
.js:86:19)
    at iterator.next (<anonymous>)
> [...tx.flatten([123])]
[ 123 ]
> [...tx.flatten([[123]])]
[ 123 ]
> [...tx.flatten([[[123]]])]
[ 123 ]
> [...tx.flatten(["abc"])]
[ 'abc' ]
> [...tx.flatten([["abc"]])]
[ 'abc' ]
> [...tx.flatten([[["abc"]]])]
[ 'abc' ]

If this is correct:

> [...tx.flatten("abc")]
[ 'a', 'b', 'c' ]

Then flatten("") should produce an empty sequence. But currently flatten([""]) produces [""].

I don't want to add a special case just for "" if avoidable. What do you recommend?

@postspectacular
Copy link
Member

Thank you for the in-depth analysis, @gavinpc-mindgrub! In this case I'd opt (like I guess you do too) for consistency, even if it means breaking the current behavior for top-level strings (incl. empty ones).

So I'd say we treat top-level strings as atoms (unflattenable, just like nested strings) and update the body of flattenWith to something like:

return isIterable(src)
  ? iterator(flattenWith(fn), isString(src) ? <any>[src] : src)
  : (rfn: Reducer<any, T>) => { ... }

With that change we get:

> [...tx.flatten("")]
[ '' ]
> [...tx.flatten([""])]
[ '' ]

> [...tx.flatten("abc")]
[ 'abc' ]
> [...tx.flatten(["abc"])]
[ 'abc' ]

@earthlyreason
Copy link
Contributor Author

@postspectacular I haven't forgotten about this, just a bit caught up in holiday/family at the moment. Holdup was mostly learning my way around to include proper tests. I did come across a question, though.

I wasn't quite sure exactly how flattenWith was intended to work, so I was going to document it in the process. However, I noticed that there is documentation for a (different) flattenWith in the iterators package, which had been added recently, but not as JSDoc.

  • I understood there was a documentation overhaul going on, but I was under the impression it would use JSDoc as the source of truth. Is that incorrect?

  • I'd be happy to document the transducer version of flattenWith along the way. (Again, I'd prefer it, for my own edification.) Would JSDoc be the proper way to go, or should I skip this altogether for now?

Thanks, & best wishes for 2020!

@postspectacular
Copy link
Member

Hi @gavinpc-mindgrub - happy new year to you!! 🎉

I thought we cleared up how the behavior of flattenWith was supposed to work in our last 2 comments here and am not quite sure which newer docs in the iterators package you're referring to (git blame shows I edited it last 2 years ago). With the new top-level string behavior discussed above, there will be some discrepancy between both packages, but that's fine since I was going to deprecate the iterators package at some time in the near future anyway.

Also I'd say don't bother too much w/ docs for this PR, they can always be added/edited later on.

Thank you for doing this all! 👍

@postspectacular
Copy link
Member

Hi @gavinpc-mindgrub - I was going to prepare a new release in the next few days and was wondering if you have an ETA of your transducer updates and/or if you still have the time to work on this at all... no shame in saying no or needing more time, I just would like to synchronize, if poss. Else we can also just do another one once you're ready... Thanks!

@earthlyreason
Copy link
Contributor Author

@postspectacular gotcha. I'll shoot for this evening, but if it doesn't come through then we can kick to the next round.

@postspectacular
Copy link
Member

@gavinpc-mindgrub no pressure pls! take your time!

earthlyreason added a commit to earthlyreason/umbrella that referenced this issue Jun 1, 2020
postspectacular added a commit that referenced this issue Jun 1, 2020
fix(transducers): #186, Fix crash when using empty string as source for several transducers
postspectacular added a commit that referenced this issue Jun 1, 2020
* 'develop' of github.com:thi-ng/umbrella:
  fix(transducers): #186, Fix crash when using empty string as source for several transducers.
postspectacular added a commit that referenced this issue Jun 1, 2020
* develop:
  Publish
  docs(transducers): update readme
  refactor(transducers): update flatten & flattenWith
  test(transducers): minor update flatten tests
  fix(transducers): #186, Fix crash when using empty string as source for several transducers.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants