how to write tests #269

Open
7 tasks
Wouter1 opened this issue Feb 16, 2021 · 9 comments

Wouter1 commented Feb 16, 2021

What is your question?
My question is about mocking and testability. In general, MATRX code seems hard to test.

Let's give a concrete example
Suppose I want to test the CollectionGoal of BW4T.

So I have to call the goal_reached(grid_world) function with -supposedly- a Mock of the grid_world

Now grid_world is a highly complex object. Just creating a Mock and filling it up a bit does not seem like it is going to work.

If you followed the general recommendations for such a situation, e.g. https://www.ibm.com/developerworks/rational/library/oct06/pollice/index.html, you would first rewrite the whole object into an interface that you can then properly mock. This seems out of the question, as (1) Python has no interfaces, (2) extracting the interface calls will be difficult, and (3) interface calls are hidden behind parameter access calls.

For (1), a workaround could be to use an abstract class instead of an interface. For the others, I have no idea.
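
A minimal sketch of that workaround for (1), using Python's abc module together with unittest.mock. WorldView, objects_at and goal_reached_simple are hypothetical names invented for this illustration, not MATRX APIs; the point is only that code written against a narrow abstract class can be tested with a mock restricted to that class.

```python
from abc import ABC, abstractmethod
from unittest.mock import Mock


class WorldView(ABC):
    """Hypothetical narrow interface: only what a goal needs from the world."""

    @abstractmethod
    def objects_at(self, location):
        """Return the names of the objects at the given (x, y) location."""


def goal_reached_simple(world, targets):
    """Hypothetical goal check: every target name lies at its target location."""
    return all(name in world.objects_at(loc) for loc, name in targets.items())


# A mock restricted to the interface: accessing anything else raises AttributeError.
world = Mock(spec=WorldView)
placed = {(1, 1): ["red_block"], (1, 2): ["blue_block"]}
world.objects_at.side_effect = lambda loc: placed.get(loc, [])

targets = {(1, 1): "red_block", (1, 2): "blue_block"}
result_all_placed = goal_reached_simple(world, targets)

placed[(1, 2)] = []  # take the blue block away again
result_one_missing = goal_reached_simple(world, targets)
```

This does not address points (2) and (3): the real grid_world would still have to be reduced to such a narrow interface first.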

Because of that, I ended up creating not a mock but a real gridworld object, which is created through the worldbuilder.

This causes additional problems: extra threads seem to be created that are still running at the end of the test, and Python then reports errors that these threads were not terminated. I ignored these for now.
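
A generic way to at least see which threads a test leaves behind is to snapshot threading.enumerate() before the test and compare afterwards. The sketch below uses only the standard library and a dummy worker thread; how the threads started by the worldbuilder should actually be stopped is not covered in this thread, so that part is left open.

```python
import threading


def leftover_threads(before):
    """Return threads that are alive now but were not in the 'before' snapshot."""
    return [t for t in threading.enumerate() if t not in before and t.is_alive()]


before = set(threading.enumerate())

# Dummy background thread standing in for whatever the code under test starts.
stop = threading.Event()
worker = threading.Thread(target=stop.wait, name="background-worker")
worker.start()

names_while_running = [t.name for t in leftover_threads(before)]

stop.set()     # ask the worker to finish ...
worker.join()  # ... and wait until it has
names_after_join = [t.name for t in leftover_threads(before)]
```

Such a check could go into a tearDown method so that a test fails loudly instead of leaving threads running.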

Next, I decided to completely hard-code the blocks to be moved, to test if the goal is reached, because the test code must be simple and should not search through gridworlds.

Now apparently CollectionGoal needs to keep track of timestamps. It does this itself by checking if there are blocks with unknown placement times and storing the times. That means I also have to work around this, by placing blocks one by one in the world and repeatedly calling CollectionGoal so that it can keep track of the block movements.

Overall, this looks way too complex for something as simple as 'check that CollectionGoal returns true if the target blocks have been placed'.

I can't find any example test code in matrx for inspiration either.

How is testing supposed to be done? Am I missing something?

To what is your question related?

  • I am not sure if I should use MATRX.
  • I have a question about the idea behind MATRX.
  • I don't know how to do something with MATRX.
  • I don't know where to find something.
  • I think I found a bug, but I am not sure.
  • I want to contribute, but don't know where to start.
  • Something else.
Wouter1 added the question (Further information is requested) label on Feb 16, 2021
Author

Wouter1 commented Feb 16, 2021

A related question: every time I create a gridworld I seem to get different IDs
This is not workable in junit tests, I need to get consistent results each time.
So should I set the object_counter back to 0 before calling the builder?

Member

jwaa commented Feb 17, 2021

To summarize your question: how can I test MATRX code that depends on complex interactions between objects/agents and the grid world, with CollectionGoal as a concrete example?

The quick and clear answer is that supporting MATRX users in simplified and automated testing of their world is not easily possible, and is unlikely to be supported in future MATRX releases. Below I outline some potential solutions and argue why they will not actually solve the issue.

Potential Solution 1
You refer to three points that are indeed not easy to do within Python, since they rely heavily on an object-oriented approach that Python supports but does not explicitly enforce (as opposed to Java, for instance). A solution might be to utilize and support duck typing within MATRX, similar to Python's ABC. This goes further than your idea of allowing GridWorld (or any other MATRX class) to function as an abstract class for any custom class, specifically to support easy testing. However, this adds complexity rather than removing it, which would miss the point entirely.

Potential Solution 2
So another solution might be needed. An idea is to allow a MATRX user to create a GridWorld instance with an arbitrary state (e.g., object locations). However, the GridWorld acts as a state machine, meaning that it relies on external input to track its state (here, agents/objects acting on a grid coordinate system). One cannot simply create an arbitrary GridWorld, ask the component you are interested in to do its thing, and then check its output, simply because you lack the history, stored in the GridWorld, of how you got to that state.

In the case of the CollectionGoal, for instance, it needs to know when each object was found in a drop-off zone. Why? Because it needs to be able to decide whether the objects were placed in the right order. Whether your test then returns True or False does not matter much, as you only tested that the CollectionGoal works.

So even if we implement methods that allow the creation of a GridWorld at an arbitrary state, it will miss the history needed to offer functionalities you would have in the actual GridWorld. Even if these functionalities are not needed for a test, you have no guarantee that the test is representative.

Potential Solution 3
Another solution might be to simulate the GridWorld from how it comes out of the WorldBuilder up to a certain specified state. However, the problem is that you can never verify exactly how, or whether, that specified state can be reached. What might be possible is to use a specific type of agent that you, as a MATRX user, programmed to bring your world from its starting state to a desired arbitrary state in order to test it. However, this just comes down to running your code and seeing if things work, only wrapped in a test method, which is not really a solution at all.

Of course it is possible that I missed an approach, so if I did, please say so! :)

Member

jwaa commented Feb 17, 2021

> A related question: every time I create a gridworld I seem to get different IDs
> This is not workable in junit tests, I need to get consistent results each time.
> So should I set the object_counter back to 0 before calling the builder?

This is by (crappy) design: the WorldBuilder functions as a factory of GridWorld instances, following the factory design pattern. Each instance it outputs should thus be viewed as a different world instance.

However, we acknowledged some time ago that the heavy reliance on object_id, even on the MATRX user's side, is quite cumbersome for that user. It introduces a number of added complexities, of which you raise one. Others include that your web links to the visualizations break if even something small changes in the WorldBuilder, and that you can only know at runtime which agents should communicate.

Our current working solution is to hide this object_id from MATRX users and use object names instead. These persist across different world instances and are user-defined, making them more tractable. However, this feature is now planned for version 3, which sadly will not be ready any time soon...
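
Until then, a test could avoid hard-coding generated IDs by looking objects up through a property that is stable across world instances. A sketch of the idea, assuming objects carry a user-defined name next to the generated ID; the dict layout, make_world and find_by_name are hypothetical stand-ins, not MATRX API.

```python
import itertools

_counter = itertools.count()  # stand-in for a global ID counter that never resets


def make_world(names):
    """Hypothetical world factory: maps generated IDs to object properties."""
    return {f"{n}_{next(_counter)}": {"name": n} for n in names}


def find_by_name(world, name):
    """Return the generated IDs of all objects carrying the given name."""
    return [obj_id for obj_id, props in world.items() if props["name"] == name]


first = make_world(["red_block", "helper_agent"])
second = make_world(["red_block", "helper_agent"])

# The generated IDs differ between the two worlds, the lookup by name does not.
ids_first = find_by_name(first, "red_block")
ids_second = find_by_name(second, "red_block")
```

The test then asserts on whatever find_by_name returns instead of on a literal ID string.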

Author

Wouter1 commented Feb 17, 2021

@jwaa

> So even if we implement methods that allow the creation of a GridWorld at an arbitrary state, it will miss the history needed to offer functionalities you would have in the actual GridWorld. Even if these functionalities are not needed for a test, you have no guarantee that the test is representative.

This seems to me the best way, next to mocking the gridworld instead of making a real one.

If I could create a GridWorld at an arbitrary state, that would include the timestamp history required in this case.

But I think the timestamp info is not part of GridWorld; it is in fact part of the CollectionGoal. So this is part of the code that I need to test.

The state should be clear and targeted at the thing I want to test. Unfortunately I don't see how to do this, because gridworlds are built using the builder; otherwise they don't work.

Running a gridworld in a real simulation is not unit testing but end-to-end testing. That may have its place, but that's not what I'm after at this point.

Author

Wouter1 commented Feb 17, 2021

@jwaa ok, so I have to work around this issue with the object_id. How do I reset the object_id to ensure my test run gets consistent object_ids?

Member

jwaa commented Feb 17, 2021

> @jwaa ok, so I have to work around this issue with the object_id. How do I reset the object_id to ensure my test run gets consistent object_ids?

Have you tried resetting the global object counter (used to set the ID)?

from matrx.objects import EnvObject  # import path assumed; adjust to your MATRX version

builder = create_builder()  # Create your builder (create_builder is your own function)
world = builder.get_world()  # Get a world
...  # Do stuff with that world

EnvObject.object_counter = 0  # Reset the global counter for objects
other_world = builder.get_world()  # Create another world
...  # Do stuff with this other world (supposedly having the same IDs)

Disclaimer: not tested, and this might have unforeseen side effects.

Edit: Fixed object_counter being named global_counter
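
In a test suite, one option is to do this reset in setUp so that every test starts counting from zero, regardless of test order. The sketch below uses a stand-in FakeEnvObject with a class-level object_counter and a hypothetical build_world function, because where exactly the counter lives in MATRX is not confirmed here; swap in the real EnvObject and your builder once that is verified.

```python
import unittest


class FakeEnvObject:
    """Stand-in for an object class that derives IDs from a shared counter."""

    object_counter = 0  # assumed to mirror the counter discussed above

    def __init__(self, name):
        self.obj_id = f"{name}_{FakeEnvObject.object_counter}"
        FakeEnvObject.object_counter += 1


def build_world():
    """Hypothetical stand-in for builder.get_world(): returns the object IDs."""
    return [FakeEnvObject(n).obj_id for n in ("block", "block", "agent")]


class ConsistentIdTest(unittest.TestCase):
    def setUp(self):
        # Reset before every test so IDs do not depend on test order.
        FakeEnvObject.object_counter = 0

    def test_first_world(self):
        self.assertEqual(build_world(), ["block_0", "block_1", "agent_2"])

    def test_second_world(self):
        self.assertEqual(build_world(), ["block_0", "block_1", "agent_2"])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ConsistentIdTest)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

Without the reset in setUp, whichever test runs second would see IDs starting at 3 and fail.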

Member

jwaa commented Feb 17, 2021

> This seems to me the best way, next to mocking the gridworld instead of making a real one.
>
> If I could create a GridWorld at an arbitrary state, that would include the timestamp history required in this case.
>
> But I think the timestamp info is not part of GridWorld; it is in fact part of the CollectionGoal. So this is part of the code that I need to test.
>
> The state should be clear and targeted at the thing I want to test. Unfortunately I don't see how to do this, because gridworlds are built using the builder; otherwise they don't work.
>
> Running a gridworld in a real simulation is not unit testing but end-to-end testing. That may have its place, but that's not what I'm after at this point.

How would you want to create a GridWorld at an arbitrary state?

Yes, the tick stored in CollectionGoal is part of that class, but it is only present if the GridWorld goes through its ticks and things change; the CollectionGoal basically monitors world progress and stores the things it finds relevant (e.g., which block is placed on a drop zone, and when).

So I'm curious how you foresee the feature of creating a GridWorld that follows a builder's blueprint, is at an arbitrary point in time of its ticks, and accounts for the "history" of how it got there (e.g., that some blocks were placed at their location before others), without simulating it up to that point?

Without accounting for this "history", the CollectionGoal, for instance, cannot be tested appropriately, since what you would then test is the case where all objects are placed on their locations simultaneously, which is quite an uncommon case. The more common case is that objects are placed separately over different ticks, which the CollectionGoal should account for (and which should thus be tested).
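
To make the difference concrete, here is a toy goal that, like the CollectionGoal described above, records the tick at which it first sees each block in the drop zone and only succeeds when the blocks arrived in the required order. OrderedGoal, the dict-based world and its keys are invented for this illustration and are not MATRX code.

```python
class OrderedGoal:
    """Toy goal: blocks must arrive in the drop zone in a given order."""

    def __init__(self, required_order):
        self.required_order = list(required_order)
        self.first_seen = {}  # block name -> tick at which it was first seen

    def goal_reached(self, world):
        # Record the arrival tick of any block not seen before.
        for name in world["drop_zone"]:
            self.first_seen.setdefault(name, world["tick"])
        if set(self.required_order) - set(self.first_seen):
            return False  # not all blocks delivered yet
        ticks = [self.first_seen[name] for name in self.required_order]
        # Strictly increasing ticks: each block arrived after the previous one.
        return all(a < b for a, b in zip(ticks, ticks[1:]))


order = ["red", "green", "blue"]

# Case 1: history is replayed, one block per tick, goal consulted every tick.
goal = OrderedGoal(order)
world = {"tick": 0, "drop_zone": []}
results = []
for name in order:
    world["tick"] += 1
    world["drop_zone"].append(name)
    results.append(goal.goal_reached(world))

# Case 2: an "arbitrary state" with all blocks already present and no history.
snapshot_goal = OrderedGoal(order)
snapshot_result = snapshot_goal.goal_reached({"tick": 3, "drop_zone": order})
```

The snapshot case fails only because this toy goal demands strictly increasing ticks; what matters is that the outcome depends on what the goal observed over time, not just on the final layout.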

Member

jwaa commented Feb 17, 2021

As a side note to all (future) readers: this discussion is scoped to unit tests that a MATRX user wants to write for MATRX code to test their own world, not to unit tests of MATRX components written as a MATRX developer.

Within the package namespace the complexity is (somewhat) reduced, although the same issues still stand, just to a lesser degree: within the package namespace one can unit test all the separate code blocks that make up a functionality instead of only testing the public methods. The latter is more of an E2E test mixed with a unit test approach.

Author

Wouter1 commented Feb 18, 2021

@jwaa

Thanks for all the help.

> How would you want to create a GridWorld at an arbitrary state?

It should be as simple as calling __init__. But this did not seem to work properly; that was the first reason to create this ticket.

> EnvObject.global_counter

Thanks. I suppose you mean EnvObject.object_counter; that seems to work.

> supporting MATRX users in simplified and automated testing of their world is not easily possible,

Indeed. May I add that testing all functions and classes that require a world is not easily possible either, and that this therefore has a much broader scope than suggested.

Because of these issues we halted our code unit testing efforts.

I highly recommend that this is addressed. Testing starts with testability. Testability is an important factor related to code quality, modularity, reliability and robustness.
