
02. Crudio Mission and Objectives


About Crudio

Automatically generating test data about which you know nothing except its structure is a powerful way to build prototypes and test software systems.

All you need to do is describe the basic shape of your data model, like organisations (customers, perhaps) that have users. If your model uses pre-defined entities (schemas of data records), you can include those entity definitions and be up and running in minutes.
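
As a rough illustration of what "describing the shape" means, here is a minimal sketch in TypeScript. It is not Crudio's actual model syntax; the entity and field names are invented for this example:

```typescript
// Hypothetical sketch only, not Crudio's real model format.
// The idea: declare entities and how they relate, nothing more.
const dataModelSketch = {
  entities: {
    Organisation: {
      fields: { name: "string", website: "string" },
    },
    User: {
      fields: { firstname: "string", lastname: "string", email: "string" },
      // every User belongs to one Organisation
      relationships: { organisation: "Organisation" },
    },
  },
};

console.log(Object.keys(dataModelSketch.entities)); // ["Organisation", "User"]
```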

You could take Crudio yourself and make it do lots of things. We love GraphQL, so we adopted the powerful combination of Postgres and Hasura. We did this because, with just two Docker containers, in seconds we have a fast and modern way to access our data, either through a GraphQL API or through the Hasura Console.

So imagine this... your horrible boss (seen the movie?) demands that, by tomorrow, you build a demonstration system that manages multiple organisations, each with a collection of users, where each organisation provides community services to people who are grouped by their specific needs.

How do you respond to that when you don't even have a database set up, you don't have any data to work with, and you don't have any APIs that a prototype app could use?!

You've got nothing! So, how might you quickly create test data that looks sensible when you show a prototype application to users and seek their feedback?

Well, the answer is, "fake it 'til you make it!"

For our example above, we need a fake database, one which we can describe as needing organisations, users, programs, clients, and cohorts. We need to populate data tables, create users and organisations, and randomly assign the users to organisations. We need to create fake programs, which are services that organisations deliver to their local communities, and fake clients and cohorts who are served through those programs. Next, distribute clients into cohorts and cohorts into programs. Now we have a lot of fake people, in fake cohorts, in fake programs, all assigned to fake organisations.

Essentially, just by describing the data you want, Crudio will create a rich data graph, generating data entities and connecting them to each other: user->organisation, client->cohort->program, and so on.
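
To make that graph concrete, the TypeScript interfaces below sketch the relationships just described. The names and fields are assumptions for illustration, not the columns Crudio generates:

```typescript
// Illustrative types for the generated data graph; field names are assumed.
interface Organisation { id: string; name: string; users: User[]; programs: Program[]; }
interface User { id: string; firstname: string; lastname: string; email: string; organisationId: string; }
interface Program { id: string; name: string; organisationId: string; cohorts: Cohort[]; }
interface Cohort { id: string; name: string; programId: string; clients: Client[]; }
interface Client { id: string; firstname: string; lastname: string; cohortId: string; }
```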

The data will be created rapidly, saved into a Postgres database, and then Hasura can be used to query and manage the data.
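
For instance, once Hasura is tracking the generated tables, the data graph can be queried over Hasura's GraphQL endpoint. The endpoint URL, admin secret, and the table and field names below are assumptions based on the example model and a typical local setup:

```typescript
// Query the generated data via Hasura's GraphQL API.
// Table/field names, URL, and admin secret are placeholders for a typical local setup.
async function fetchOrganisations(): Promise<void> {
  const query = `
    query {
      organisations(limit: 5) {
        name
        users { firstname lastname email }
      }
    }`;

  const response = await fetch("http://localhost:8080/v1/graphql", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "x-hasura-admin-secret": "myadminsecret", // placeholder
    },
    body: JSON.stringify({ query }),
  });

  console.log(JSON.stringify(await response.json(), null, 2));
}

fetchOrganisations();
```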

Phew! That's what Crudio does. It can't save the world, but it can help you change a Horrible Boss into a Happy Boss!

Key Objectives

These are the key objectives of automating the creation of test data:

  1. Fake data can be saved to a database.

    • We might normally just keep our fake data in memory for automated testing.
    • But we can save it to a database, so that when we run demonstrations, the data is predictable and supports rehearsed and repeatable storytelling.
    • This way, in our example data model, we would see the same organisations every time we use the saved data with our prototype app.
    • You can even save the data to a JSON file and then use it for in-memory testing.
  2. The definition of how to create the data and the data itself are managed as a unit, called a data model.

    • This way we can easily version control the data model with our prototype application and automated tests.
    • Data can be traced back to the rules which created it, and this all sits alongside the tests which help us be more confident that our prototype works.
  3. The data should make sense to users, and not be totally random. A data entity should be consistent for the context in which it is generated.

    • What?! Well, consider how most people create random data, using random strings. We would create a random person like "Joe Bloggs", but then give him an email address of "some.user@somewhere.com".
    • It would be better to create "joe.bloggs@healthdepartment.com", which makes much more sense to users when they see the data.
    • For example, a generated person with the first and last name Bob Smith has an email address of bob.smith@somewhere.com, as opposed to having random, unrelated values (see the sketch just after this list).
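
A minimal sketch of that principle in TypeScript, assuming a simple generator that derives the email from the generated name and an organisation domain rather than inventing each field independently (this illustrates the idea, not Crudio's internal generator):

```typescript
// Illustrative only: keep generated fields coherent with each other.
interface GeneratedPerson { firstname: string; lastname: string; email: string; }

function makePerson(firstname: string, lastname: string, orgDomain: string): GeneratedPerson {
  // Derive the email from the name and organisation domain, not from random strings.
  const email = `${firstname}.${lastname}@${orgDomain}`.toLowerCase();
  return { firstname, lastname, email };
}

// "Bob Smith" at a health department gets bob.smith@healthdepartment.com,
// rather than an unrelated, random address.
console.log(makePerson("Bob", "Smith", "healthdepartment.com"));
```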

Simply put, fake data should be sensible and useful, helping you build systems faster and engage stakeholders to gather feedback. As far as practicable, the data should be coherent and should not raise questions about why it doesn't appear to be sensible and relevant to the problem domain.

New Data Generated Every Time

Bear this one point in mind: every time we run our data generation process, we get a completely new and unique set of data.

Let's say you run Crudio right now, search the users table, and find "Joe Bloggs". If you run Crudio again moments later, you might not see "Joe Bloggs", because that name might not be randomly generated again.

This is why automated test data is so awesome: it stops you from making assumptions about the values of data. You can create data which is good, bad, simple, or challenging, all in the interest of presenting your app with meaningful data and ensuring it is thoroughly tested.

Of course, completely random data is not much help when you come to demonstrate your prototype app or service. In that case you want to know something about the data, like the names of organisations and people, and the roles and departments they work in.

Use the hard-coded assignment feature to write known values into the data model. This way, you can create a data model super fast, which has lots of data in it, but at the same time you can seed values that you know are there. So you can tell a story about the CEO of an organisation, and every time the data is generated, you can be confident there is at least one CEO whose name you know (the sketch below shows the idea).
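
A minimal sketch of the seeding idea, in plain TypeScript rather than Crudio's actual hard-coded assignment syntax; the helper and names are invented for illustration:

```typescript
// Illustrative only: combine one known, seeded record with otherwise random data,
// so a demo always contains a CEO whose name you can rely on.
interface DemoUser { firstname: string; lastname: string; email: string; role: string; }

function withSeededCeo(generatedUsers: DemoUser[]): DemoUser[] {
  const ceo: DemoUser = {
    firstname: "Alice",
    lastname: "Chan",
    email: "alice.chan@healthdepartment.com",
    role: "CEO",
  };
  // The rest of the data set stays random; the seeded CEO is always present.
  return [ceo, ...generatedUsers];
}
```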