Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

initial --watch mode #2

Merged
merged 11 commits into from
Apr 22, 2021
Merged

initial --watch mode #2

merged 11 commits into from
Apr 22, 2021

Conversation

izelnakri
Copy link
Owner

No description provided.

@Krinkle
Copy link
Contributor

Krinkle commented Apr 20, 2021

[…] node.js watch mode is not supported due to limitations on qunit

Hi! I hope you don't mind that I'm watching the project. Could you elaborate on this limitation? The default QUnit CLI has a built-in watch mode nowadays.

Feel free to ignore my question if it would take from your momentum. I think qunitx I is about exploring new territory, so my question is not important. But know that I'm happy to answer any of your questions, and I don't mind making changes in core if needed. Thanks!

@izelnakri
Copy link
Owner Author

izelnakri commented Apr 20, 2021

Hi @Krinkle ! Thanks for your response! I was actually going to create an issue on qunit side for this, you've acted earlier than I did :)

This issue is complicated, I did check the cli code and most of the qunit code. The problem is require.cache deletion doesnt work properly for ESM imports(even dynamic imports), also currently qunit seems to be built/assuming browser page reload for test suite restarts.

Problem #1(internal QUnit state)

There are specific states that gets persisted inside the window.QUnit within its module closure that is not exposed as public API, we might need to spend a significant effort refactoring the internals to allow a public API to expose all the internal state for in-memory test suite reset functionality. Something like window.QUnit.reset(), because clearing window.QUnit.config.queue isnt enough, window.QUnit.start() also checks closured/cached variables too to see if its started already. I hacked the module/exposed API to also allow running multiple window.QUnit.start()s:

// on qunit/qunit.js#3453

    // ....
    only: test.only,
    stop: function stop() { // NOTE: instead probably we want a public window.QUnit.reset() that also resets internal test suite state
      config.autostart = false;
      globalStartCalled = false;
      runStarted = false;
    },
    start: function start(count) {
      if (config.current) {
        throw new Error("QUnit.start cannot be called inside a test context.");
      }
      
      // ....
    }

This allows running after file watch in node.js window.QUnit.start() however test suite immediately finishes with no succeeding or failing tests. This is happening due to internal state QUnit holds and only partially exposing it via window.QUnit.config. I've even completely reset the window.QUnit.config by doing a:

let initialConfigCache = deepcopy(window.QUnit.config)'
// ...

window.QUnit.config = initialConfigCache;

// ...

window.QUnit.start();

Even this doesnt work, so clearning window.QUnit.config.queue or any config related state itself isnt enough, some state still gets persisted after the test suite finishes it seems. Therefore if we introduce a window.QUnit.reset() that can completely clean/reset the internal state I can implement the node.js mode watching easily without even trying a costly and error prone require.cache cleanups.

Problem #2(Why cli cache reset approach didnt work):

In addition I spent many hours trying to clean the require.cache, in node.js ESM mode this doesnt work well, { type: "module" } caches static and dynamic imports differently than the default node.js/commonjs requires and its not documented, probably not even possible to clear. I even tried this following thing and it doesnt work:

import { createRequire } from 'module';

export function runTests(testFiles) {
  const require = createRequire(import.meta.url);

  // ....

  Object.keys(require.cache).forEach((module) => delete require.cache[module]);
  
   // Then import the test modules

   window.Qunit.start();   
}

Even this and many variants of this(completely reseting window.QUnit/reloading etc) doesnt work with ESM modules. I couldnt get it to work. I had seen you do something like these here afterward by coincidence as well during my reads :) https://github.com/qunitjs/qunit/blob/main/src/cli/run.js#L145

In other words I think we need in-memory QUnit resets/restarts, with something like this public API:

window.QUnit.reset(); // clean all internal state to the initial/empty state during module init
window.QUnit.start();

Not only this would be faster, less hacky and error prone & less node.js specific, it would also probably cleanup the internals even further by reorganizing the program state. Let me know if I missed anything, I'd love to implement this feature and already spent many days trying many different approaches as you can see :)

@izelnakri
Copy link
Owner Author

izelnakri commented Apr 20, 2021

Hi @Krinkle , I've spent few more hours today trying to do this in memory, the thing is we still need ESM cache busting even if we do things in memory because the files are watched and reloaded in node.js context. I'm 100% sure cache busting doesnt work on ESM modules at the moment. I hope node.js will allow busting the ESM cache. Then I can implement --watch mode for node.js, currently it works in this MR for --browser mode because browser mode uses esbuild to compile dependencies and create a new bundle each time.

Meanwhile we could add window.QUnit.reset() API core to reset all state in-memory for programmatic reruns(without page refresh). I've done something like this probably some actions arent needed and we could clear up the code:

    reset: function stop() {
      config = window.QUnit.config;
      config.queue.length = 0;
      config.modules.length = 0;
      config.currentModule = {
        name: "",
        tests: [],
        childModules: [],
        testsRun: 0,
        testsIgnored: 0,
        hooks: {
          before: [],
          beforeEach: [],
          afterEach: [],
          after: []
        }
      };
      globalSuite.childSuites.length = 0;
      delete globalSuite._startTime;
      delete globalSuite._endTime;
      config.currentModule.suiteReport = globalSuite;
      config.modules.push(config.currentModule);

      delete config.stats;
      delete config.started;
      delete config.updateRate;
      delete config.filter;
      delete config.depth;
      delete config.current;
      delete config.pageLoaded;
      delete config.timeoutHandler;
      delete config.timeout;
      delete config.pollution;

      ProcessingQueue.finished = false;
      config.autostart = false;
      globalStartCalled = false;
      runStarted = false;

      if (typeof config !== 'undefined') {
         // browser dist config refers to the internal config variable instead of window.QUnit.config unless assigned explicitly with
         // var config = window.QUnit.config;
      }
    },

@izelnakri
Copy link
Owner Author

@Krinkle apparently import('module.js?uuid') avoids caching, I just learned this. Trying the changes now, we probably still need few changes to core, but I'll let you know :) nodejs/node#38322 (comment)

@izelnakri
Copy link
Owner Author

izelnakri commented Apr 21, 2021

@Krinkle I was able to make it work only once I can call window.QUnit.reset() method: new hack/patch before window.QUnit.start(). Here is the patch I had to do, almost every line is needed(it reverses all the mutation). Maybe only few can be omitted but failed at removing most of it:

    reset: function reset() {
      ProcessingQueue.finished = false;
      globalStartCalled = false;
      runStarted = false;

      let QUnitConfig = window.QUnit.config;

      QUnitConfig.queue.length = 0;
      QUnitConfig.modules.length = 0;
      QUnitConfig.autostart = false;

      ['stats', 'started', 'updateRate', 'filter', 'depth', 'current', 'pageLoaded', 'timeoutHandler', 'timeout', 'pollution']
      .forEach((key) => delete QUnitConfig[key]);

      let suiteReport = QUnitConfig.currentModule.suiteReport;

      suiteReport.childSuites.length = 0;
      delete suiteReport._startTime;
      delete suiteReport._endTime;

      QUnitConfig.modules.push(QUnitConfig.currentModule);
    },
    start: function start(count) {
      // ... normal start definition

@izelnakri izelnakri merged commit ac7ab42 into main Apr 22, 2021
@Krinkle
Copy link
Contributor

Krinkle commented Apr 28, 2021

@izelnakri Hey, sorry for taking a while. I've been thinking about this and not had the time to properly sit down longer for this.

I think the QUnit CLI's current approach is not an example I'd recommend following. I didn't think about how it currently works when I referred to it. Looking at it now, it feels rather fragile to me. I think the only reason we currently do this is perhaps because our past selves thought it'd be marginally faster than spawning a sub process. Or that maybe we didn't yet want to deal with untangling certain parts of the CLI code.

Continuing to share global state feels wrong to me.

  • If we can trivially and easily purge all require/import caches for includes files that have changed, we'd still be re-using the state of files that didn't change. That seems bad practice for an end-user project (e.g. two test cases of MyFoo relying on global state in class Bar retaining between cases), but even worse so for a test runner where a user is expected to be running code that isn't yet perfect. If I mistakenly assign Bar.x = 1 statically and then fix the caller, it would remain dirty in the next run from the watch mode because Bar.js didn't change.
  • Suppose we purge all module caches (except those of qunit, qunitx, and our own dependencies, assuming they don't overlap), that still leaves the global state. E.g. if you left a global variable behind earlier, and fixed it, it would remain. Not to mention things like setTimeout and what not.
  • One may have an unfinished teardown/afterEach during development, etc.
  • And remember that aren't just re-runs, they're reruns from a watch mode that can and should trigger fairly often on known-incomplete code. E.g. due to auto-save, or subconcious pressing of ctrl-S, or when you save the first file or two files you know need to be modified, etc.
  • All this applies to the browser as well, possibly more so since global variables are easier to create by accident there, and because more things are tolerated or even best practice to do through global means. A UI library might attach a delegate event handler on document.body. Polyfills that remain from previous runs, possibly causing false positives, or false negatives.

So, yeah, I think the common approach that other frameworks follow (e.g. tape, Karma, airtap, etc.) is to use a fresh process in Node, and a fresh tab in the browser. If we're worried about latency, we can for example spawn the "next" process directly after a run finishes so that we're primed and ready to go for the next run.

@izelnakri
Copy link
Owner Author

izelnakri commented May 1, 2021

Hi @Krinkle, I appreciate your thoughtful responses, better late than never :)

I hacked qunit/made --watch mode and 2 modes(node & browser) working for qunitx. Here is how I achieved it:

  • This build script runs on each release to add the window.QUnit.reset() functionality, same code on my PR to qunit: https://github.com/izelnakri/qunitx/blob/main/scripts/hack-qunit.js#L11

  • window.QUnit.reset() was a requirement for node.js --watch mode because I dont want to setup QUnit to window.QUnit along with its event listeners on each file change: https://github.com/izelnakri/qunitx/blob/main/lib/setup/node-js-environment.js
    In addition to the runtime cost of reloading/re-evaluating the entire QUnit, I probably need to clear all the event listeners which doesnt seem ideal.

  • Then I discovered I still need to mutate/deal with QUnit internal state in order to implement --failFast and/or abort functionality on keyboard shortcut presses("ql" to run all tests, "ql" to run last test, "qf" to run last failing tests, havent finished releasing aborting yet but failFast now works.) In order to --failFast the test suite I found I have to do this:
    window.QUnit.config.queue.length = 0; . I found this to be the only reliable way to abort the test suite/fail it immediately if there is a failing test case because it still triggers done event listeners. The other solution I had previously tried to call the private function done() which was making the done event listener called twice. This behavior might also be a good idea have as a public API, something like window.QUnit.finish which allows developers to finish a test suite programmatically from outside for --failFast or abort behavior.

  • I think QUnit simply needs to provide a "low-level" api for developers to build tools around it, that way I won't need to append/hack some pre-built js code :) I suppose having an internal state is necessary for any test runtime/runner, however I don't think its necessarily bad to expose it publicly since QUnit gets exposed globally on window.QUnit. What we miss is the "low-level" customization and full exposure of the state that can also be tested/fully-covered. This might also help the unit tests as low-level apis could allow checking the entire qunit state on internal qunit test cases. Low-level apis for important and commonly used tools allows further experimentation in the community, gives more control to users and can actually help increase the quality of the unit tests.

=============================

  • The last point you've mentioned is a valid concern that I've not thought about :) Thankfully SPA frameworks like ember.js internally clears all event listeners on teardown however this could indeed be a problem for testing other frontend libraries. These libraries should always have the low/high-level api to allow for teardowns which could be called in qunit beforeEach/afterEach hooks. Also my browser --watch mode make a url revisit so it tearsdown and evaluates QUnit from scratch on each file change.

Lastly, I think we shouldn't completely follow other JS testing tools, I have an extensive experience with most of them and none of them got it fully right I think, particularly the concurrency story even including ava. Although we can definitely be inspired on certain things from each. Each testing framework has its own problems but this is something I can write in-depth on later/we can chat further if you are interested. I think QUnit has a particularly unique position in the testing space being designed for browser runtime first and now runs interchangably :) I'm able to finally dedicate some serious time on open-source projects so we can discuss/collaborate ;) Thanks again for your interest & investigation!

@Krinkle
Copy link
Contributor

Krinkle commented Jun 5, 2021

@izelnakri Thanks, I love all the work you're doing here. It's a fresh look (but informed by experience) at many things, and I think it's exactly what we need.

I do still feel a bit skeptical on re-running tests within the same Node or browser tab. Could you try to sell me on why/how it beneficial to developers for us to do this internally, compared to spawning a fresh process. It seems like that would simplify things and categorically remove a significant source of possible bugs and confusions.

I care about speed, a lot, but I have not previously found the cost of one process or tab to be significant in that way. We could even spawn these ahead of time when in watch mode, so that it has the bridge and test runner already warmed up and ready to go. This in addition to the general latency tolerance we have by running in the background on file changes (e.g. before one decides to look for the results etc).

I can imagine there being cases where purely the loading of the application source code takes a significant amount of time (the test runner can load ahead of time, and the test suites can be filtered as you do, but app source code is tricky indeed because we can't load that until we know what has changed). Is this the scenario that motivated you in this direction? Or is it somethin else? Do you have some data or examples I could look at to better understand why it takes as long as it does? If through some some magic we could reduce that time to within a duration you like, would that change your mind?


PS: We can continue to work side-by-side for a while too, for example, we could land and release https://github.com/qunitjs/qunit/pull/1598/files as an experimental feature first, and then you can start reducing complexity on this side. The only thing I ask is a couple of tests in that case so that it's easier for me not to accidentally break it in refactoring.

@izelnakri
Copy link
Owner Author

izelnakri commented Jun 6, 2021

Hi @Krinkle Im glad to see you're very interested in the experiment :) So far I've been able to succeed with my needs however as you might seen, I had to tweak/hack qunit to accomplish this, qunitx currently runs a vendored qunit to be able to clear the complete tests state on reruns. This is the only way I found that allowed Qunit.on events to be run(for tap format output on done, early --failFast on testDone, being able to kill a running qunit on keyboard shortcuts).

I suppose some of it could be done through ipc, however right now I accomplish this state reset/small mutation in my vendored/hacked qunit with almost no code on qunitx, it just makes an JS 'import()' with querystring to bust the es module cache.

I see the reason why you're hesitant to rewrite the process spawning logic, since much of the code is already written. However I find my approach to be simpler since it requires very little changes and no code for ipc(there just the websocket connection only during --browser mode to execute a function on the puppeteer instance). As I see it 'qunit' creates and maintains its state the cli runner shouldnt change it ideally.

Lastly and perhaps the most importantly, spawning a new process creates its own memory/state thus making state management even more difficult. For example, recently I introduced --before and --after script loading feature which allows qunitx users to introduce or change the global state before a test suite runs. First this seemed like a bad idea, then I realized it is a requirement in certain cases like when I want to run my pretender extension(@memserver/server) to be run on qunitx node or --browser mode interchangably, with 0 changes on test files, which is quite remarkable/amazing. Qunitx allows me to tests universal javascript libraries, for node tests I introduced a mocked dom for node.js mode with 'jsdom' before the test suite runs. This was quite possibly the first time it ever happened in JS ecosystem, 'pretender' even says it doesnt support node.js but I was able to hack it.

So in summary, I preferred handling test runtime in a single node.js process(when its not viewed in browser, thats seperate):

1- Existing in-process approach doesnt require process spawning, ipc, stdout management and process supervising. Just a node.js es import and existing qunit event registration code.
2- All the state is contained in single memory context. It eases the state management in my view as i find it to be the hardest thing generally in programming. Also allowed me to expose the global and qunit state to the users, if they want to adjust it for their tests(eg. my pretender node testing use-case) without having any extra scheduler or performance costs. It would also be easier to test and profile the system I suppose since everything happens in a single process/js vm.

=======

Also apologies for not introducing any tests for my proposed qunit code changes, I found it currently difficult to test the qunit internal state as its hidden behind the 'this' context, we might need to do significant changes to the existing tests and internal state management to be able to cover this fully with tests, thats why omitted them, I really wanted to include few tests on the qunit side ;)

As you might have seen I also didn't fully cover the qunitx source code as I found this specific code to not worth the effort cost/benefit wise, unless it gets significant usage/requires maturity/stableness during releases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants