Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Semantic Test Selectors #327

Open
wants to merge 12 commits into
base: master
from

Conversation

Projects
None yet
@jgwhite
Copy link

commented Apr 19, 2018

Rendered

My intention is to follow this RFC with a second, further exploring the ideas from the talk (watch this space).

@kellyselden

This comment has been minimized.

Copy link
Member

commented Apr 19, 2018

That was a good read.

When you say "perceivable", does that mean CSS visibility of parent elements is taken into consideration? For example, an accordian (that hides content instead of destroying) with a "Save" button in each. Since the end user only "perceives" one save button at a time, do I need to take any steps to differentiate the buttons for testing?

@jgwhite

This comment has been minimized.

Copy link
Author

commented Apr 19, 2018

When you say "perceivable", does that mean CSS visibility of parent elements is taken into consideration?

@kellyselden yes, we’d want it to behave exactly the same way the accessibility tree (and by extension the screen reader) does. It’s a very important point, and we should add it to the RFC.

@CodingItWrong

This comment has been minimized.

Copy link

commented Apr 19, 2018

This is one of the two approaches to selecting test elements that I currently hear about:

  1. Perceivable text, as in Capybara, react-testing-library, and this RFC.
  2. Test-specific selectors, as in ember-test-selectors, Detox, and Kent's past writing.

Adding perceivable text as an easy built-in option seems like a no-downside. But before moving to it as a default, it would probably be good to weigh the pros and cons of perceivable text vs. test-specific selectors in a variety of ecosystems. Kent could probably provide good rationale on why he's moving to prefer perceivable text. It could also be useful to get a survey of experienced Rails devs to see if many of them experience the downsides of selecting by perceivable text.

Personally I've used both approaches, but I've mostly worked on projects small enough that either has worked fine. Folks in larger/older codebases could probably provide better perspective.

@jgwhite

This comment has been minimized.

Copy link
Author

commented Apr 19, 2018

…before moving to [perceivable text] as a default, it would probably be good to weigh the pros and cons of perceivable text vs. test-specific selectors in a variety of ecosystems. Kent could probably provide good rationale on why he's moving to prefer perceivable text. It could also be useful to get a survey of experienced Rails devs to see if many of them experience the downsides of selecting by perceivable text.

@CodingItWrong I completely agree and hope that the comment thread on this RFC will yield some case studies to get us started. Anecdotally, my own experiences using perceivable text via Capybara were always very pleasant, but then again I wasn’t interacting with complex custom components of the type found in modern web apps.

@CodingItWrong

This comment has been minimized.

Copy link

commented Apr 19, 2018

@jgwhite Same, I have a few years of heavy Rails experience and only brief frontend JS app experience so far.

@jgwhite

This comment has been minimized.

Copy link
Author

commented Apr 19, 2018

@kentcdodds just to let you know, this RFC is inspired by (and cites) your work on react-testing-library and dom-testing-library. I’m pretty convinced we can build atop dom-testing-library (if the community is in favour of the proposal of course). Thanks for blazing the trail!

@jgwhite jgwhite force-pushed the jgwhite:semantic-test-selectors branch from 9926b17 to 975240d Apr 19, 2018

@cibernox

This comment has been minimized.

Copy link
Contributor

commented Apr 19, 2018

I think this is a good idea, and given that current helpers already accept dom elements as well as css selectors, implementation is pretty easy.

P.e., a semantic fillIn could be:

// fill-in.js
import { fillIn as rawFillIn } from '@ember/test-helpers';

export async function fillIn(label, text) {
  let inputElement = findInputLabeled(label); // throws error if not found
  return rawFillIn(inputElement, text);
}

I like it. I'm not 100% convinced it should be part of the default blueprint, tho.

@jgwhite

This comment has been minimized.

Copy link
Author

commented Apr 19, 2018

@cibernox agreed, implementation should be very achievable, and the fidelity of “semantic lookup” can improve over time.

I’m not 100% convinced it should be part of the default blueprint

Could you say more?

To clarify: I’m advocating for default status in the far future, after this has had a chance to cook in the community cauldron.

@kellyselden

This comment has been minimized.

Copy link
Member

commented Apr 19, 2018

To clarify: I’m advocating for default status in the far future, after this has had a chance to cook in the community cauldron.

@jgwhite Perhaps it may be too early for an RFC then, as it hasn't spent enough time "cooking" and refining? Just a thought.

@cibernox

This comment has been minimized.

Copy link
Contributor

commented Apr 19, 2018

@jgwhite What I don't like a lot is having two fillIn or click methods, and I don't think we can only rely on semantic selectors because there there will be exceptions or developers that just won't care about semantics.

I am thinking that perhaps after some experimentation, we could unify them.

// click.js in @ember/test-helpers

export default async function(selectorOrText) {
  if (selectorOrText instanceof HTMLElement || isSelector(selectorOrText)){
    // current behaviour
  } else {
    // semantic behaviour
  }
}

That way we wouldn't have to choose between semantic vs css-based when importing the helpers.

@jgwhite

This comment has been minimized.

Copy link
Author

commented Apr 19, 2018

Perhaps it may be too early for an RFC then, as it hasn't spent enough time "cooking" and refining? Just a thought.

Yes and no. I strongly believe this should be part of the mental model for testing in Ember, and RFCs are the best way to plant that seed. While I would love to have a published implementation of `ember-semantic-test-helpers‘ to go along with this, I think getting the community aligned on the principles is a more significant step.

Plus, I’m far too lazy to create and maintain an addon 😄

@jgwhite

This comment has been minimized.

Copy link
Author

commented Apr 19, 2018

@cibernox yep that’s an interesting alternative. I’ll add it to the relevant section.

I considered it when preparing the talk, but concluded that if it was feasible to implement isSelector reliably then Capybara would already have done it and so better to keep things explicit to begin with.

Definitely interesting territory to explore.

@JoaoGFarias

This comment has been minimized.

Copy link

commented Apr 19, 2018

I am not aware of the position of Ember.js approach to page objects, but there is an addon which implements this pattern more "traditionally", however very similarly to the proposed approach.

http://ember-cli-page-object.js.org/docs/v1.14.x/

Maybe this can be useful on the RFC refinement.

@jgwhite

This comment has been minimized.

Copy link
Author

commented Apr 19, 2018

@JoaoGFarias thanks for bringing page objects into the conversation. I think page objects and the semantic approach can complement each other. I know the page objects team have been thinking along similar lines.

@Gaurav0

This comment has been minimized.

Copy link
Contributor

commented Apr 19, 2018

Can we target by the key we are using for i18n rather than the ultimate value?

@jgwhite

This comment has been minimized.

Copy link
Author

commented Apr 19, 2018

Can we target by the key we are using for i18n?

@Gaurav0 I think that would be a really cool enhancement. It is, in a sense, another facet of creating accessible/inclusive apps. Fuel for a future RFC perhaps?

@cibernox

This comment has been minimized.

Copy link
Contributor

commented Apr 19, 2018

@cibernox yep that’s an interesting alternative. I’ll add it to the relevant section.

I considered it when preparing the talk, but concluded that if it was feasible to implement isSelector reliably then Capybara would already have done it and so better to keep things explicit to begin with.

Definitely interesting territory to explore.

Interesting. I'd say that anything that does not start with #, . or [] can't be a css selector. I suppose there is a couple edge cases in which you want to click on a link whose text happens to start with # (like a hashtag) but we could figure something out.

@kellyselden

This comment has been minimized.

Copy link
Member

commented Apr 19, 2018

what about input:first-child or the like

@cibernox

This comment has been minimized.

Copy link
Contributor

commented Apr 19, 2018

what about input:first-child or the like

touché, and with custom element the task is essentially impossible. I take it back then.

@jgwhite

This comment has been minimized.

Copy link
Author

commented Apr 19, 2018

touché, and with custom element the task is essentially impossible. I take it back then.

You never know, it might be tractable, and it would be very cool.

@simonihmig

This comment has been minimized.

Copy link
Contributor

commented Apr 19, 2018

I do find this approach interesting, though certainly it needs some thorough discussion and maybe some early implementation to play with and experiment...

I'm a bit concerned about the brittleness you already addressed. We generally use i18n to centralize all labels even for apps that are not (yet) multi-languange, so non-devs can also easily contribute wording changes. Would certainly confuse them a lot when a simple change would cause their PR to go red...

Regarding the non-unique labels issue, I would think about introducing another concept to disambiguate, which is "context". Technically this would just be scoped selectors instead of the global ones, so you can distinguish say the email input of the sign-in form from the same one within the sign-up form, even if their labels, title etc. are exactly the same.

But that's not only a technical workaround, I think it could play just nicely with the overarching concept of user perception, as users do perceive context. For visual perception, things like positioning, spacing etc. convey meaning, they connect or separate different elements semantically. But I guess screen readers will also provide that context (given accessible markup), so users will know that the currently focused input is within the sign-up form for example.

Example:

await fillIn('Password', 'topsecret', 'Sign up');

The third param could be optional, but if present will add a scope to the selector. In this case this could match a password field that is inside a form that has a title of "Sign up", or a <fieldset> with a <caption> or something like that.

@jgwhite

This comment has been minimized.

Copy link
Author

commented Apr 19, 2018

Regarding the non-unique labels issue, I would think about introducing another concept to disambiguate, which is "context".

@simonihmig great observation. Capybara has the within helper for just this purpose:

within('div#delivery-address') do
  fill_in('Street', with: '12 Main Street')
end
@jgwhite

This comment has been minimized.

Copy link
Author

commented Apr 19, 2018

I'm a bit concerned about the brittleness you already addressed. We generally use i18n to centralize all labels even for apps that are not (yet) multi-languange, so non-devs can also easily contribute wording changes. Would certainly confuse them a lot when a simple change would cause their PR to go red...

@simonihmig this is my main concern too. Then again, maybe introducing extra rigour into copy changes is desirable?

Or perhaps @Gaurav0’s idea of i18n-key-as-selector. I must say, in every codebase I’ve worked on the internationalised strings were chaotically organised so figuring out the correct key always involved trial-and-error. I suspect something that enforced a set structure would be welcome.

@chilicoder

This comment has been minimized.

Copy link

commented Apr 19, 2018

Excuse me, but we already have at least two brilliant libraries for semantic tests -- ember-page-objects and ember-test-selectors
the first one will give

indexPage.fillInLogin('alice@example.com');

the second

await fillIn('[data-test-login]', 'alice@example.com');

Both are very semantic

@Gaurav0

This comment has been minimized.

Copy link
Contributor

commented Apr 19, 2018

@chilicoder Those libraries do not solve the problem this RFC is trying to solve: they do not enforce a11y.

Perhaps the title should of the RFC should include the word accessibility?

@jgwhite

This comment has been minimized.

Copy link
Author

commented Apr 19, 2018

Excuse me, but we already have at least two brilliant libraries for semantic tests -- ember-page-objects and ember-test-selectors

@chilicoder indeed. Both approaches are wonderful but neither require that the elements in question have perceivable descriptive labels. That is the main motivation of this RFC
— to put accessibility front-and-centre when programmatically describing our UIs.

@CodingItWrong

This comment has been minimized.

Copy link

commented Apr 19, 2018

@jgwhite Are the other two motivations listed in the RFC equal in your mind to accessibility? I probably don't prioritize accessibility as highly as I ought. For me, the "Cleaner Tests" motivation is more motivating, so I think that's worth keeping in the conversation.

@jgwhite

This comment has been minimized.

Copy link
Author

commented Apr 20, 2018

Are the other two motivations listed in the RFC equal in your mind to accessibility?

@CodingItWrong I believe all three together make the strongest case. Someone could successfully argue that the accessibility motivation is satisfied by other means (linting, ember-a11y-testing) and the other two motivations would still provide reason to consider the proposal.

I think the only real innovation of this RFC is to marry accessibility with test ergonomics. Turns out they go together really well!

@mehulkar

This comment has been minimized.

Copy link

commented May 10, 2018

It’s worth noting that it’s not impossible to stick to data selectors and then using test helpers to throw warnings on inaccessible elements. The rules for what makes something accessible are more easily captured deeper in the tooling than to assume that everyone will write tests the same way.

@EndangeredMassa EndangeredMassa referenced this pull request May 21, 2018

Merged

[Blog] article on a11y in Ember and addons #3349

6 of 7 tasks complete
@knownasilya

This comment has been minimized.

Copy link
Contributor

commented Jun 7, 2018

Is https://github.com/tradegecko/ember-semantic-test-helpers the testing ground for this RFC? Is there anything preventing this from going into FCP?

@billybonks

This comment has been minimized.

Copy link

commented Jun 12, 2018

@knownasilya that it is, though there are a few differences between it, not sure if those should be applied to the RFC, and go through another round of review or we just use the conversation here, to inspire the addon as we build, and move conversations to the repo?

i am currently working on stable release should be ready pretty soon.

@billybonks

This comment has been minimized.

Copy link

commented Jul 2, 2018

I'm going to go through all the ideas in comments and take ideas from initial implementation, apply changes to RFC so that we can go to fcp

@billybonks

This comment has been minimized.

Copy link

commented Jul 8, 2018

after going through all the comments, and applying lessons learnt these, i want to modify the rfc according to these ideas, and any comments made on them for FCP.

https://gist.github.com/billybonks/cb494e064226fc9e9c442b7bd32b0a81

@lolmaus

This comment has been minimized.

Copy link

commented Sep 12, 2018

Hi!

We're using Cucumber in our app (via ember-cli-yadda) and currently we have hundreds and thousands of very similar steps like this:

When user clicks foo
When user clicks bar
When user clicks baz

Each of those thousands steps has a unique implementation:

{
  "when user clicks foo"() {
    return pageObject.foo.click();
  },
  "when user clicks bar"() {
    return pageObject.bar.click();
  },
  "when user clicks baz"() {
    return pageObject.baz.click();
  },
}

As you can see, these are very repetitive and rely on page objects heavily.

By following this RFC, 80% of our step implementations can be reduced to a few universal ones:

{
  "when user clicks `$label`"(label) {
    return clickByLabel(label);
  },
}

This has the potential of both improving accessibility AND drastically simplifying our test codebase! 🎉 👯 🍾

One thing I'm concerned though is nested invocations. We won't be able to guarantee uniqueness of labels because:

  • no single developer has a grasp of the whole application;
  • some components are meant to be invoked multiple times on the same page.

For example, this is gonna be a very typical case:

When user clicks on `Expand` of `Activity item 5` under `Project activity group Tuesday, March 24` 

This RFC needs details on:

  • How to handle nested invocations like this. We probably need chains like:
    find("Project activity group Tuesday, March 24").find("Activity item")[5].click();
  • How to handle lists. Should the index of an item be hardcoded into the label text? Or should find() return an array?

PS Is somebody already using this approach via custom implementation or addon?

@BryanCrotaz

This comment has been minimized.

Copy link

commented Sep 12, 2018

Is there a reason that you can’t do

"when user clicks $string"(selector) {
return pageObject[selector].click();
},

I have a similar solution where I’ve written a regex that matches css selectors and I have a generic step for “when I click on $selector” using find.

@lolmaus

This comment has been minimized.

Copy link

commented Sep 12, 2018

@BryanCrotaz Yes, several, actualy:

  • this will make Cucumber user stories ugly for developers and unreadable for non-developers;
  • it fails to follow the "say more with a less powerful language" doctrine and has no a11y benefit;
  • it forces you to maintain useless page objects.

I find page objects a powerful tool, but 80% of them is just clicking, filling and reading text on selectors. Why maintain an army of them, if human readable, a11y-empowering labels can be used directly in step names?

@BryanCrotaz

This comment has been minimized.

Copy link

commented Sep 12, 2018

@lolmaus here's how I did it:

// steps.js
const dictionary = new yadda.Dictionary()
    .define('coords', /\((\d+(?:.\d+)?),\s*(\d+(?:.\d+)?)\)/)
    .define('number', /(\d+(?:\.\d+)?)/)
    .define('selector', /['"]?([a-zA-Z\d-_. ]+)['"]?/);

   .when('I click on $selector', async function(selector) {
      const target = find(selector);
      assert.ok(target, `${selector} should be present`);
      await click(target);
      assert.ok(true, this.step);
      await settled();
    })

    .when('I type $string into $selector', async function(text, selector) {
      const target = find(selector);
      assert.ok(target, `${selector} should be present`);
      await fillIn(target, text);
      assert.ok(true, this.step);
    })

 .then('dom $selector exists', async function(selector) {
      await settled();
      const element = find(selector);
      assert.ok(element, `element ${selector} should exist`);
    })

    .then('dom $selector does not exist', async function(selector) {
      await settled();
      const element = find(selector);
      assert.notOk(element, `element ${selector} should not exist`);
    })

// etc

 Scenario: cut paste an element with Cmd+X Cmd+V

    Given a project
    And a template-editor component
    When I click on button.add.text
    And I press Command+X in .template-editor
    And I press Command+V in .template-editor
    Then selection is [1]
    And there is 1 ".template-player div.text" dom node

Implementing this RFC will be great, but in the mean time this technique would get rid of your thousands of custom steps.

@lolmaus

This comment has been minimized.

Copy link

commented Sep 12, 2018

Thx for sharing! Got a couple of questions:

  1. Is this a real-world or synthetic example?
  2. Is it an integration test? Can't see Given a template-editor component step implementation.

Anyway, this is not relevant to the RFC; and in real-life scenarios steps like click on button.add.text will be too technical and (IMO) nullify one of main Cucumber benefits.

@lolmaus

This comment has been minimized.

Copy link

commented Sep 12, 2018

My question from above still stands: how will this RFC deal with nested invocations and arrays?

For example, I want my test to click here:

image

How do I do that with this RFC? It mentions neither find() nor method chains.

@jgwhite

This comment has been minimized.

Copy link
Author

commented Sep 12, 2018

@lolmaus thanks so much for your feedback.

Regarding nested items and items in lists…

If I recall correctly Capybara has a within helper:

within ".some-selector" do
  click "Some button"
end

Personally I always found this approach very readable.

Not quite sure how you marry it with Cucumber/Gherkin but that’s potentially out of scope for this discussion.

However, coming at this from the a11y perspective, I think we can nudge the conversation in a slightly different direction. Let’s look at your (perfect) example:

Screenshot of comment thread pointing to like button

In a test following this RFC I would like to be able to write:

await click('React with a heart to billybonks’ comment “I’m going to go through all the ideas…”');

And this string would actual be present in title or aria-label of the element:

<button aria-label="React with a heart to billybonks’ comment “I’m going to go through all the ideas…”">❤️</button>

Adding this specificity is beneficial for testing and for accessibility too. Particularly with screenreaders that feature something like VoiceOver’s “rotor” which will present all links on a page, in order but out of context, for rapid navigation/interaction.

The above is covered in the non-unique labels section of the RFC.

@BryanCrotaz

This comment has been minimized.

Copy link

commented Sep 12, 2018

@lolmaus

  1. Is this a real-world or synthetic example?

Real world

  1. Is it an integration test? Can't see Given a template-editor component step implementation.

Yes. I've got lots of steps, just pulled out the simpler ones. It's an html editor, so it has drag, resize, click at coords steps and various others.

Anyway, this is not relevant to the RFC; and in real-life scenarios steps like click on button.add.text will be too technical and (IMO) nullify one of main Cucumber benefits.

Agree it's off-topic, just suggesting a simplification of your thousands of steps in the mean time.

// as you ask
 .given('a $string component(?:\\s*with)?$options', async function(componentName, options) {
      await this.render(compileTemplate(`{{${componentName} ${options}}}`));
      assert.ok(true, this.step);
    })
@lolmaus

This comment has been minimized.

Copy link

commented Sep 12, 2018

@jgwhite Thanks for your answer!

Before I comment on it, let me explain where I stand.

I'm currently introducing BDD & Cucumber to my project and my team. There's a pending debate on pros vs cons, and not everyone on the team recognizes the virtues of BDD. Also, not everyone agrees that the complexity increase that comes with Cucumber is worth it.

So my goal is to maximize the benefit of Cucumber while minimizing its impact on complexity.

When I listened to your Say More™ talk, I was amazed with the idea of semantic selectors. By using semantic selectors directly in user stories, I can achieve multiple benefits at once:

  • user stories are more readable,
  • thousands of very similar step implementations get replaced with a dozen of generic ones,
  • steps no longer require to be backed by thousands of page object entries,
  • user stroies are more truthy: they reference DOM nodes directly, so neither repetitive step implementations no page objects have the potential of concealing bugs (false positives).

I'm extremely excited about that.

The a11y-empowering potential of this RFC is truly amazing, but I would like to emphasize that semantic test selectors aren't only about a11y.

My goal here is to defend the needs of usual testing practices. I want semantic test selectors in Ember to be full-featured and covering all the testing needs, without having to revert to old-school selectors.

I could come up with my own semantic selectos addon that would perfectly satisfy all the testing needs (and I'm very excited to try). But the result will not be a11y-aware because currently I know nothing about a11y. 👎

This RFC has the potential of killing two exceptionally prized birds with one stone: BDD and a11y. a11y is a very noble goal, but as we pursue it we must make sure not to sacrifice normal testing needs.

Let's make sure that both goals are fully satisfied. I believe we have a good chance to succeed, let's not consider reverting to old-school selectors an offical way to fill the normal testing flaws of this RFC. A necessity to combine both approaches will be an indication that this RFC has failed.


Now back to your suggested solution! Though it seems nice from the a11y perspective, unfortunately, it has a whole lot of flaws from the test-coding perspective:

  1. App state for acceptance tests is often seeded with random data. Here's a typical step for example:

    Given a PR with 5 comments

    In the testexample, I want to be able to target each of those comments.

    With your approach, the seeding step would blow up to:

    Given a PR with id 1
    Given a comment with body `Foo` created by user Doe belonging to PR 1
    Given a comment with body `Bar` created by user Doe belonging to PR 1
    Given a comment with body `Baz` created by user Doe belonging to PR 1
    Given a comment with body `Quux` created by user Doe belonging to PR 1
    Given a comment with body `Zomg` created by user Doe belonging to PR 1

    And then you have to reuse those names in actions, instead of referencing comments by index.

    This makes coupling between steps much tighter, which in turn makes tests cases so much harder to write (no random seeding) and to validate (too many details to read and collate)!

  2. It is impossible to iterate through items because they don't share a semantic selector.

    With every label being unique, how do I semantically assert There should be 5 comments on the page?

  3. The deeper nesting goes the longer labels are.

    Imagine the Changes tab of a PR:

    • it has multiple files;
    • each file has dozens of lines;
    • each line may have an arbitrary number of comments;
    • each comment may have an arbitrary number of replies (not a case with GitHub, but in my product we do have threads);
    • each reply may have a few reaction emojis.

    In order for labels to stay unique, they'll have to be terribly long. Reading them aloud them will definitely not be "rapid".

  4. Even with enormously long labels, uniqueness is still not guaranteed. If the same author makes two comments that start with the same words (or are even entirely identical, e. g. a Mind indentation comment can be repeated several times in the same file), then tests will crash, even though the developer did nothing wrong.

  5. Your suggestion is very custom-tailored to a specific case. Without general, universal rules for composing labels, it is impossible to guess how a label for each element is gonna look like.

    They will all be very arbitrary, each developer will end up using their personal style, and the result will be a mess.

    This RFC does need specific guidelines for composing labels.

  6. While aria-label is more or less clear to me, I (having no a11y experience so far) am very frustrated with aria-labelledby.

    According to MDN, human-readable labels can't be used in aria-labelledby. Only HTML ids can! MDN provides this example:

    <div id="billing">Billing</div>
    
    <div>
        <div id="name">Name</div>
        <input type="text" aria-labelledby="billing name"/>
    </div>
    <div>
        <div id="address">Address</div>
        <input type="text" aria-labelledby="billing address"/>
    </div>

    The problem is that ids must be unique, but we deal with multiple instances of deeply nested components. We are building apps, but the spec makes an impression of being designed only to cover articles whose HTML is written entirely by hand!

    I can see only one way to workaround the ids problem: components should accept a base id as an attribute (and an index, in case of lists), so that they can prefix their ids with the base id and index, then pass their ids as base ids to their children.

    Note that the necessity of passing machine-readable base ids does not lift the necessity of passing human readable labels, which must also be composed with base ids from their parents. Just thinking about doing both makes me feel dizzy. 😵

Again, I could propose a semantic testing strategy that has all of these issues resolved -- but it will not be a11y aware. So let's cooperate!

A Boss Baby reference (GIF animation)

Two rivals cooperate


Regardless of the above, this RFC requires explicit guidelines on how exactly to apply labels to HTML elements and Ember components: what attributes to use in which situations, etc.

The addon could provide some tools for that, or we could cooperate with ember-a11y-landmarks.

@jgwhite

This comment has been minimized.

Copy link
Author

commented Sep 13, 2018

@lolmaus

My goal here is to defend the needs of usual testing practices. I want semantic test selectors in Ember to be full-featured and covering all the testing needs, without having to revert to old-school selectors.

Beautifully put.

App state for acceptance tests is often seeded with random data

Very true. I should add this to the drawbacks section of the RFC. I will note that this can be a footgun for visual regression testing tools like Percy too.

With every label being unique, how do I semantically assert There should be 5 comments on the page?

This is a very interesting example. Let’s reframe it in human terms: if you were asking a screen reader user to count the number of comments on the page, how would you tell them to identify a comment?

There’s actually a really great answer for this: accessible HTML semantics. When a screen reader encounters a UL or OL or [role="list"] it announces the number of items in the list! The screen reader user can actually answer your question quicker than someone depending purely on what’s visually perceivable because the screen reader will say to them “Comments, list, 5 items”.

The HTML could be:

<ul aria-label="Comments">
  <li>…</li>
  <li>…</li>
  <li>…</li>
  <li>…</li>
  <li>…</li>
</ul>

Or:

<section aria-label="Comments" role="list">
  <article role="listitem">…</article>
  <article role="listitem">…</article>
  <article role="listitem">…</article>
  <article role="listitem">…</article>
  <article role="listitem">…</article>
</section>

Or:

<h2 id="comments-title">Comments</h2>

<section aria-labelledby="comments-title" role="list">
  <article role="listitem">…</article>
  <article role="listitem">…</article>
  <article role="listitem">…</article>
  <article role="listitem">…</article>
  <article role="listitem">…</article>
</section>

I think this is wonderful because it gives us a precise, prescriptive way to select for elements in our test, dervied from standard HTML+ARIA semantics.

I can imagine the Gherkin statement being:

Given a PR with 5 comments
When I visit the PR page
Then there should be a list labelled “Comments” containing 5 items

Your example was so provocative that I ended up writing a lot more than I intended. I’ll try to address the other points later on in the day (or perhaps someone else will address them (probably better than me!)).

@billybonks

This comment has been minimized.

Copy link

commented Sep 13, 2018

I think the way that i would have built this test with my team is:

within('I'm going to go through all the ideas in comments and take ideas from initial implementation, apply changes to RFC so that we can go to fcp',  function() {
 click('add heart emoji')
});

assert.list('comments').hasNItems(5);

the struggle i have is
click('add heart emoji')
vs
click('React with a heart to billybonks’ comment “I’m going to go through all the ideas…')

To understand which is the better approach would require user testing, because there may be ways, to build these labels and another way may have a better experience. in the latter it sounds like every button is going to have the extra information of the comment.

i think it's more preferable if the SR said:

"I’m going to go through all the ideas…"
"reactions"
"button happy emoji"
"button heart emoji"
"button select emoji"

but if the latter is the better experience i also have an idea that i want to implement soon being

click('/react with a heart/gi'). i have seen other use cases for this.

the within helper is already built, so is the assert.someSemanticThing

The problem is that ids must be unique, but we deal with multiple instances of deeply nested components. We are building apps, but the spec makes an impression of being designed only to cover articles whose HTML is written entirely by hand!

When i read power select code a while back, this was handled very well

{{comment-component}} <---- uses ember generated id for example ('ember 1234')

<div class="row">
  {{emoji type="heart" id=(concat elementId "-react-heart'}}
</div>

given this way of applying id's you rely on ember to use a unique prefix and you concatenate that prefix, this way you have confidence in having unique id's this can be quite verbose given certain use cases, but very powerful in others.

{{text}}

@lolmaus

This comment has been minimized.

Copy link

commented Sep 13, 2018

@jgwhite You essentially suggest that an item component has no label to semantically identify its type. We can only reference it in the context of its parent:

Then 5th item of the `Comments` list should say `I'm going to go through all ideas`

I would like to be able to say simply:

Then 5th `Comment` should say `I'm going to go through all ideas`

The difference may seem minor in this example. But as we pursue the goal of targeting a reaction emoji on the Changes tab, we impose a lot of complexity. Labels become enormously long. While this is a necessity for that particular case, it becomes a nuisance for simpler cases.

When there's a unique but deeply nested component, say, a subscribe button, I want to target it simply with its semantic name: Subscribe button or something like that.

But the methodology we're inventing here must be universal. It can't be hand-tailored to each individual case. Because components are invoked by components which are invoked by components.... They all must be relying on some default way of composing labels. And from the route template, there is no simple way to tell the innermost component to switch from the default strategy to another one.

@billybonks says:

there may be ways, to build these labels [optimally for testing needs] and another way may have a better [screen reader] experience

This means that we must come up with a strategy that all pages and components will follow. It must be flexible and powerful, it must satisfy all the usual testing needs and a11y needs at the same time.

Thus, I'd rather have each component have a short semantic label that corresponds to the component's name:

<section aria-label="Comments" role="list">
  <article role="listitem" aria-label="Comment">…</article>
  <article role="listitem" aria-label="Comment">…</article>
</section>

This gives us flexibility:

  • When the component is alone on the page, I can target it directly:

    When user clicks on `Subscribe button`
    {
      "when user clicks on `$label`" (label) {
        return clickByLabel(label);
      },
    }
  • When there are multiple instances of the component, I can do this:

    When user clicks on `Subscribe button` in 5th `Comment`
    {
      "user clicks `(.+?)` in (?:(\\d+)(?:st|nd|rd|th)) `(.+?)`" (childLabel, parentIndex, parentLabel) {
        const parents = findAllByLabel(parentLabel);
        const parent = parents[parentIndex];
        const child = findByLabel(childLabel, parent);
        return click(child);
      }
    }

Note that you can switch from one approach to the other anytime, without having to change the labelling scheme. This way, the code is also not prone to any of the issues I mentioned in my previous comment.

Note that element body remains available to screen readers.


@billybonks, your code sample is definitely inspired by Ruby. Unfortunately, JS does not work like that: in your code the click function will not be aware of the context provided by within.

There are several ways to work around this.

  1. Pass context as an argument:

    return within("I'm going to go",  function(context) {
      return click("add heart emoji", context);
    });
  2. Invoke click with within's context:

    return within("I'm going to go",  function() {
      return click.call(this, "add heart emoji");
    });
  3. Have within return a curried click:

    return within("I'm going to go",  function({click}) {
      return click("add heart emoji");
    });

(Note that return or async/await are necessary everywhere for the test not to finish prematurely.)

I believe all these variants are too funky and force you into deep nesting (in complex cases).

I'd rather go with a simpler approach:

const parent = findByLabel("I'm going to go");
return clickByLabel("add heart emoji", parent);

This scales well without nesting:

const file = findAllByLabel("File")[4];
const line = findAllByLabel("Line", file)[27];
const comment = findAllByLabel("Comment", line)[0];
return clickByLabel("Add heart emoji", comment);

With the upcoming pipe operator and partial application language features, intermediate variables could be avoided:

return findAllByLabel("File")
  |> objectAt(?, 4)
  |> findAllByLabel("Line", ?)
  |> objectAt(?, 27)
  |> findAllByLabel("Comment", ?)
  |> objectAt(?, 0)
  |> clickByLabel("Add heart emoji", ?)
@jgwhite

This comment has been minimized.

Copy link
Author

commented Sep 13, 2018

@lolmaus

Thus, I'd rather have each component have a short semantic label that corresponds to the component's name:

<section aria-label="Comments" role="list">
  <article role="listitem" aria-label="Comment">…</article>
  <article role="listitem" aria-label="Comment">…</article>
</section>

This seems great to me (assuming it works well for screen reader users). The high order bit, as far as I’m concerned, is that for something to meaningfully accessible in a test it should also be meaningfully accessible to a screenreader. What you’ve proposed above is a splendid example of this philosophy in practice.

@billybonks

This comment has been minimized.

Copy link

commented Sep 13, 2018

@billybonks, your code sample is definitely inspired by Ruby. Unfortunately, JS does not work like that: in your code the click function will not be aware of the context provided by within.

I am taking advantage of javascript being single threaded. The library has a singleton config class where you can set a root element
https://github.com/tradegecko/semantic-dom-selectors/blob/master/src/config.js#L25
by default it would be the application root element but when you use the within helper it changes that root until the block resolves

import { config } from 'semantic-dom-selectors';

export default async function within(selector, func) {
  let newRoot = document.querySelector(selector);
  let oldRoot = config.rootElement;
  try {
    config.rootElement = newRoot;
  } catch (e) {
    throw Error(`within Error: Selector '${selector}' was rejected:\n${e}`);
  }

  await func();

  config.rootElement = oldRoot;
}

(Note that return or async/await are necessary everywhere for the test not to finish prematurely.)

that is correct i made a mistake in my example

within("i'm going to go through all the ideas in comments and take ideas from initial implementation apply changes to RFC so that we can go to fcp",  async function() {
 await click('add heart emoji');
 return click('add smile emoji');
});

assert.list('comments').hasNItems(5);

I believe all these variants are too funky and force you into deep nesting (in complex cases)

given that this is working, because it does i use it in production app testing, would you consider it?

This scales well without nesting:

const file = findAllByLabel("File")[4];
const line = findAllByLabel("Line", file)[27];
const comment = findAllByLabel("Comment", line)[0];
return clickByLabel("Add heart emoji", comment);

in it's current form it will throw an error Ambiguous label found, this is on purpose because i believe that it is code stink in most cases and when it's not you have the option of using the within helper.

i would argue that the above is bad fro screen readers, because firstly it should say which line you are focused on and which file you are focused on.

within('File index.js', function() {
  return within('Line 27', function() { 
    return within("I'm going to go through all the ideas in comments and take ideas from initial implementation",      function() { 
        return click('Add heart emoji')
    })
  })
})

but the comment string is probably unique so you can use it without nesting.

within("I'm going to go through all the ideas in comments and take ideas from initial implementation", function() { 
   return click('Add heart emoji')
})

I have constrained the api on purpose because it is easier to open it later then to constrain it later, when we know all the best practices.

but at the same time i have allowed points of configuration where you can overcome issues of best practice to be able to enjoy the erganomics.

for example it is possible today to write this.

within('file index.js comment asd', function(){
 return click('heart emoji');
});

if when the community around the add is larger and we can get more opinions and have a better knowledge of how to craft a good accessible app, we can change the API, because we started with a constraint API.

I really appreciate your use case, because it tests the API, given that i would be happy to pair with you if you want, in order to test out these cases in practice. I think it would be extremely beneficial, as part of the evolution of the addon.

@lolmaus

This comment has been minimized.

Copy link

commented Sep 13, 2018

@billybonks This trick means that you can't use vanilla click helper inside within (and any other stuff other than what's offered by semantic-dom-selectors). :(

@billybonks

This comment has been minimized.

Copy link

commented Sep 13, 2018

@lolmaus

This comment has been minimized.

Copy link

commented Sep 17, 2018

The "Say more" talk and this RFC have inspired me to convert my test codebase to semantic selectors, getting the following benefits:

  • A few dozens of universal, reusable Cucumber step implementations instead of thousands of very similar ones.
  • No need to maintain thousands of page object entries.
  • Step names are standardized.
  • The truth being tested is contained in step names directly instead of being implied under the hood in page objects and step implementations.

I've achieved this by essentially reverting to the Test Selectors pattern (see simplabs/ember-test-selectors and Ticketfly/ember-hook). But instead of using ad-hoc, machine-readable values, I went with semantic, human-readable ones.

I am extremely satisfied with the result and I would like to kill two birds with one stone and reuse those semantic attributes for a11y.

I researched a11y and ARIA a bit and discovered, that there is no straightforward way to do it. Unfortunately, the WAI-ARIA spec is extremely limiting. aria-label can't be used as a universal reference selector: it simply doesn't work for non-interactive/landmark/widget elements, and when it works it may shadow the text content of an element. 😫 And the spec provides no alternative: no attribute that can be used with any element, accessible via screen readers and not shadowing content.

I went back and re-read this RFC, hoping to find a way to fix this. I was discouraged to find out that it doesn't provide anything like that. It only provides click and fillIn! 😰

For this RFC to be useful for me, it should offer the following:

  • Semantically lookup any element, not just ARIA-driven ones. Real-life tests do way more things than clicking and filling. An RFC with such an ambitious name should not discriminate non-blind use cases.
  • Do anything with the element: inspect to make assertions, run mouse/keyboard events, etc.
  • Allow looking up a collection of elements, so that it can be iterated over, or its length asserted.
  • Allow looking up an element within a given parent (provided as a DOM node or string reference).
  • It should be obvious how to apply a selector to any given element in HTML.
  • It should be obvious how to compose a semantic selector for a given element in a test -- without reverting to ad-hoc, non-reusable phrasing.
  • Semantic selectors must be reasonably short, simple and efficient.
  • The RFC should not contain pitfalls like crashing a legit test case when two comments have identical text.
  • Any additional helpers such as within should be usable with normal test code, not only with the two magic helpers.

@jgwhite and @billybonks, I believe you should ask yourself these questions:

  1. Is it technically possible to achieve all of the above with a11y?
  2. Are you willing to?

If the answer is no, then I suggest that you rename this RFC to something less ambitious, so that the scope of the RFC title does not exceed the actual effort of the RFC. For example: "click and fillIn helper equivalents that look up elements via WAI-ARIA perceivable text".

And if the answer is no, you should also consider leaving this effort to an addon. I do acknowledge the importance of a11y and I understand the desire to impose a11y in an Ember best practice. But if this RFC is merged in this current form, are you sure it will not end up being known as "Those two weird helpers? Yeah, I tried them a couple of times but they're not matching my elements, dunno why. And I can't figure out how they can make my test code semantic... They are so perfect in theory, but my test code is just too down-to-earth for those idealistic helpers to fit in... I guess a11y is not for me."

@jgwhite

This comment has been minimized.

Copy link
Author

commented Sep 19, 2018

@lolmaus thank you for thinking about this so deeply and testing the ideas so extensively on a real project.

At this point you have arguably thought more deeply about this than I have but I’ll give you my takes:

aria-label can't be used as a universal reference selector: it simply doesn't work for non-interactive/landmark/widget elements, and when it works it may shadow the text content of an element

Am I understanding correctly that you’d like to use aria-label as a replacement for data-test-? I’d love to see some more examples of where you’d like to do this. My instinctive reaction is to ask “is there a perceivable element (heading, label etc.) that you could target instead?” but I'm sure there are circumstances I’m not anticipating.

I went back and re-read this RFC, hoping to find a way to fix this. I was discouraged to find out that it doesn't provide anything like that. It only provides click and fillIn! 😰

Perhaps the RFC doesn’t state this explicitly enough, but developers should always be able to switch to variants of find, click, and fillIn that use CSS selectors rather than semantic selectors if the situation requires it. The RFC proposes either separate imports or overloading.

  • Semantically lookup any element, not just ARIA-driven ones. Real-life tests do way more things than clicking and filling. An RFC with such an ambitious name should not discriminate non-blind use cases.

I think it’s perfectly acceptable to “drop down” to CSS selector based helpers where necessary.

  • Do anything with the element: inspect to make assertions, run mouse/keyboard events, etc.
  • Allow looking up a collection of elements, so that it can be iterated over, or its length asserted.
  • Allow looking up an element within a given parent (provided as a DOM node or string reference).

Do you think these points could be addressed by adding a semantic variant of find?

function find(label: string, scope?: HTMLElement): HTMLElement | null

@billybonks could we add such a function to the ember-semantic-test-helpers API?

  • It should be obvious how to apply a selector to any given element in HTML.

I don’t think this is universally attainable. As above, I’d say it’s perfectly acceptable to use a CSS selector if there is no perceivable semantic element that can easily be used as a reference.

  • It should be obvious how to compose a semantic selector for a given element in a test -- without reverting to ad-hoc, non-reusable phrasing.

As above, a semantic find helper may help with this.

let form = find('The legend of a form').form;
let field = find('Email', form);
await triggerKeyEvent(field, 'keypress', 127);

I’m using find but of course this could be named much more specific like findByPerceivableText.

  • Semantic selectors must be reasonably short, simple and efficient.

Would you mind giving some examples of cases where they’ve gotten unwieldy? (Arguably my example above of “ React with a heart to billybonks’ comment…” is a pretty unwieldy but I’d love to see some real-world examples too.

  • The RFC should not contain pitfalls like crashing a legit test case when two comments have identical text.

Haha, indeed. Though I would say the means of generating test fixture data contributes to such pitfalls.

  • Any additional helpers such as within should be usable with normal test code, not only with the two magic helpers.

Agreed. within is not in the scope of this RFC.


Let me say again how much I appreciate the thought you’ve put into this. I think it’s a fantastic outcome that you’re able to clearly provide an “acid test” or “acid thought experiment” that can help us determine whether it’s feasible to move forward with this approach as the default or not.

If this RFC ends up not being merged and we take the addon route, I’d say that’s still a fantastic outcome. ember-semantic-test-helpers came into existence as a result of the RFC process and that is already more than I could have hoped for. Not to mention all the fantastic conversation in the comments. As @kellyselden commented towards the beginning of the process, maybe it’s better to prove the idea in the wild via the addon.

Personally, I feel strongly that we can strive for “semantic by default” as the ultimate goal. I don’t think it would need to be a universal solution—abstractions always need escape hatches, after all—but if that’s the topic of another RFC in a year’s time, so be it.

@lolmaus

This comment has been minimized.

Copy link

commented Sep 19, 2018

@jgwhite Thank you for tolerating my manner. I often get bitter and aggressive, while internally I perceive my language as passionate and truth-seeking, not realizing how offensive it may be until too late. I really appreciate you not taking it personally. 🙇

Am I understanding correctly that you’d like to use aria-label as a replacement for data-test-?

Yes, that was what I attempted at certain point. I wanted to kill two birds with one stone: have semantic labels and also improve a11y (in a quite imperfect but still valuable way).

But then @vvainio came and showed me this comment by @jared-w-smith:

Another important warning is that aria-label and aria-labelledby will override the visible text for certain elements, such as links and buttons. I often see developers that add labels to these with the intent of providing additional information or instructions without realizing that the link or button text will then be entirely inaccessible.

Too bad this fact is so obscure, almost secret knowledge. If not @jared-w-smith, we would fall into this trap.

I found no alternative to aria-label that could be picked up by screen readers without messing with comprehending the structure.

So we decided to switch back to data-test-. 😞


I’d love to see some more examples of where you’d like to do this. My instinctive reaction is to ask “is there a perceivable element (heading, label etc.) that you could target instead?” but I'm sure there are circumstances I’m not anticipating.

The matter is that I don't want to be limited to perceivable elements.

My dream here is to be able to replace selectors that hardcode DOM hierarchy with semantic, human readable ones. And they should be able to cover all possible testing situations, not just the ones covered by WAI-ARIA.

Note that this RFC is titled Semantic Test Selectors which corresponds to my hope and at the same time says nothing about a11y.


Perhaps the RFC doesn’t state this explicitly enough, but developers should always be able to switch to variants of find, click, and fillIn that use CSS selectors rather than semantic selectors if the situation requires it. The RFC proposes either separate imports or overloading.

I think it’s perfectly acceptable to “drop down” to CSS selector based helpers where necessary.

I agree that it is important to provide a way to fall back to CSS selector-based helpers.

But I believe a "Semantic Test Selectors" RFC must provide a way to fully rely on semantic selectors. If the RFC forces me to mix both approaches, I believe many develoeprs will find little value in it.


  • Do anything with the element

Do you think these points could be addressed by adding a semantic variant of find?

Yes. But it must be possible to look up any element, not only landmarks, widgetds and interactive elements.


@billybonks could we add such a function to the ember-semantic-test-helpers API?

I believe it already exists in tradegecko/semantic-dom-selectors. But it's not documented and it does not accept a context (parent scope) argument.


Would you mind giving some examples of cases where they’ve gotten unwieldy? (Arguably my example above of “ React with a heart to billybonks’ comment…” is a pretty unwieldy but I’d love to see some real-world examples too.

I can't since I still don't understand how perceivable text works. But the example you mentioned startles me: what if there are two similar comments that only differ starting from the 1000th character. Do my semantic selectors have to be 1K long? 😱

And what if two comments are entirely identical? There will be an error that the RFC provides no way to recover from, except for reverting to CSS selectors. And if I know I'll have to revert sooner or later, why bother doing it in the first place?

I must say that my teammate said that the RFC felt immature & naive, and I couldn't disagree.


Agreed. within is not in the scope of this RFC.

It could be. It just has to have a non-magical API. It could even have a counterpart in ember-test-helpers.

But the existence of within does not make find unnecessary.



I now feel that data-test- selectors satisfy my semantic needs entirely, except they don't let me kill two birds with one stone.

My effort to turn this RFC into the direction that satisfies my personal needs -- now seems naive. There is no way we can change how WAI-ARIA and screen readers work.

So if you also don't see a way to fully replace data-test- selectors for all possible cases with some a11y equivalents, then I believe you should rename this RFC: remove semantic and include a11y and perceivable text. And the summary of the RFC should explain that the scope of the RFC is rather narrow and that it is not offering a way to move away from CSS selectors entirely.

@jgwhite

This comment has been minimized.

Copy link
Author

commented Sep 19, 2018

I must say that my teammate said that the RFC felt immature & naive, and I couldn't disagree.

While I find the rest of your tone very balanced, it was pretty hard not to feel personally hurt by this statement. Honestly, you needn’t have included it.

@lolmaus

This comment has been minimized.

Copy link

commented Sep 19, 2018

@jgwhite Sorry, that was only a reference to the to the stage of this RFC development. RFCs mature over time and get rid of "teething problems" as they are discussed and applied to real-life use cases. All I meant was that this RFC was yet to resolve its issues.

@jgwhite

This comment has been minimized.

Copy link
Author

commented Sep 19, 2018

@lolmaus thanks for clarifying.

@MelSumner MelSumner self-assigned this Jan 28, 2019

@rwjblue rwjblue added the T-testing label Feb 14, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.