Skip to content
Jezz Santos edited this page Aug 13, 2020 · 19 revisions

In the design of QueryAny we aimed to achieve two things:

  1. A fluent language that developers can use to construct queries across various repository technologies.
  2. A programming model that repository developers can use to implement their IStorage<TEntity> repository implementations (using the QueryAny query language).

The Language

We wanted a familiar language to developers that looked and felt a little like LINQ, and a little bit like SQL, but not exactly like that, since this language cannot be either, and because we recognise that there are real constraints to what QueryAny must be generalised to, to be able to access: SQL, no-SQL and other types of repositories.

For example: we can't realistically expect to support all the optimization features of SQL (as a language) since No-SQL repositories would not be able to implement them (as SQL repositories would), so the language is necessarily restricted.

Note: It is worth noting that when we refer to No-SQL repositories, we are specifically excluding No-SQL Graph repositories.

The language of QueryAny was necessarily designed to be a fluent language in C#, to make it familiar to use, and to be highly discoverable for developers. (Unlike more flexible and variable languages like LINQ-to-SQL that often requires the developer to observe its capabilities in multiple examples in order to learn and access all its features). QueryAny will teach you how to use it!

Design Assumptions

We have found that it is always useful to state the assumptions that inform design decisions, so that consumers of the design can understand its applicability to their specific context, and expose its design, trade-offs, compromises and limitations.

No design is perfect, and we assume that too.

Developer Motivations

To utilize a generalised repository layer for access by your domain logic has many unquestionable advantages when designing and implementing many software products and services sustainably over time. These advantages are obtained by disciplines such as: separation of concerns, de-coupling, testability, inversion of control, layering, etc.

However, as with any design, there are general trade-offs that have to be made between: performance and maintainability to achieve these advantages. Some purests will be very uncomfortable with this idea, and we accept that not everyone is at the same level of understanding or has the same level of tolerance for this trade-off.

Note: It should be recognized that in general, as designs become more performance-optimized they also attract far more complexity and brittleness, and entering into this complexity should be deferred as long as possible to prevent premature increases in complexity that hinder velocity and progress ina codebase that may need to remain changeable.

We assume that the developer using QueryAny understands these kinds of trade-offs over-time in their software products and services, and is willing to compromise early performance-optimization over having a highly maintainable, flexible and ordered codebase. That they also understand that when it comes to optimizing certain areas of their codebases, in those areas they also compromise maintainability and will introduce more complexity to attain the maximum performance-efficiencies that various technologies may provide.

Note: Highly maintainable/ordered systems, can be performance-optimized in any area. But highly performance-optimized systems generally become far to complex and brittle to make maintainable some time later.

If these design principles are not generally accepted, then developers won't find much value in repository patterns that are achievable with libraries like QueryAny, and they will need to work far harder to keep complexity down in their codebases.

Developer Acceptance

These are the assumptions we have of the various repositories that are accessible by QueryAny:

  • We assume that a single repository (eg. database, or store) will contain one or more 'collections' or related entities (eg. tables, buckets, containers, etc).
  • We assume that a developer will wish to search for and retrieve a whole entity from a single collection in the repository, OR
  • We assume that a developer may want to search for and retrieve an aggregated entity from one or more collections the repository (explicitly or implicitly related in the same repository).
  • We do not assume that any repository implements referential integrity
  • We do not assume that any repository implements normalization.
  • We do not assume that any repository supports in-process processing of entities (like SQL Server does), and this may require many of these capabilities to be done by the client in memory.
  • We assume that memory is now a cheap commodity and compute resources have cheap access to very much of it, and that in-memory repositories are fast replacing slow disk bases stores (like SQL Server).

Capabilities & Features

These are the capabilities that we assume developers are comfortable with accepting as necessary limitations of QueryAny:

  • Be able to query for a whole single entity from a repository. Requires -> Entity Naming | Field Conditions
  • Be able to query for a partial single entity from a repository. Requires -> Entity Naming | Field Conditions | Field Selection
  • Be able to query for an whole aggregate entity from a repository. Requires -> Entity Naming | Field Conditions | Entity Joining
  • Be able to query for an partial aggregate entity from a repository. Requires -> Entity Naming | Field Conditions | Entity Joining | Field Selection
  • Be able to select all or some of the parts of a single/aggregate entity, for efficiency. Requires -> Field Selection
  • Be able to order by only one field in the result set. Requires -> Ordering
  • Be able to limit the number of entities returned, and offset to a 'page' of results in a large result sets. Requires -> Limiting | Offsetting

Note: There will be some capabilities that are supported by some types of repositories that are definitely not achievable across all types of repositories, and these are out of the scope of what QueryAny can do (or should) do.

  • Not navigating No-SQL graph repositories.
  • Not supporting all possible data types.
  • Not supporting deep querying into documents in document databases. Only querying top level fields of document.
  • Not optimizing for certain features of SQL:
    • OUTER and RIGHT joins
    • Sub select statements
    • Views, Stored Procedures, Functions, etc.

Language Syntax

See the Language Reference

The Model

The query language defines a structure of the query to implementers of the IStorage<TEntity> repository interface. This is the model used to drive the specific repository technology implementation.

For example: let's say that you wanted to write an implementation of IStorage<TEntity> for a JSON file system. You may decide to store all data in one JSON file, or you may decide to store a container of data in a different file (by the name of that container). You may decide to store the data as NV pairs or as a whole document. The design choice is entirely yours.

But to do any of this you will need an object model in code that describes the developers' query, to help you navigate your repository and extract the right query results.

QueryClause

The QueryClause defines the entire query and is the root of the model.

Each query will include the information about all the Entities involved (the primary entity and any joined entities), and will include all the data about each of the Where conditions, and any other supported features.

In essence, you would navigate into a specific entity to discover information about its specific Join, and its Selects.

Example: a fully fledged QueryClause like this:

            var query = Query.From<CustomerEntity>()
                .Join<ProfileEntity, string>(customer => customer.Id, profile => profile.CustomerId)
                .AndJoin<PreferencesEntity, string>(customer => customer.Id, preferences => preferences.CustomerId, JoinType.Outer)
                .Where(customer => customer.Id, ConditionOperator.EqualTo, "25")
                .OrWhere(subQuery =>
                    subQuery.Where(customer => customer.Id, ConditionOperator.NotEqualTo, "25")
                        .AndWhere(customer => customer.CreatedDateUc, ConditionOperator.GreaterThan, DateTime.UtcNow))
                .Select(customer => customer.Id)
                .Select(customer => customer.Name)
                .SelectFromJoin<PreferencesEntity>(customer => customer.AvatarPicture, preferences => preferences.Avatar)
                .Take(100).Skip(0)
                .OrderBy(customer => customer.Id)

Would have a JSON model like this:

{
    "PrimaryEntity": {
        "Name": "customer",
        "Selects":
        [{
                "EntityName": "customer",
                "FieldName": "Id",
            }, {
                "EntityName": "customer",
                "FieldName": "Name",
            }
        ]
    },
    "JoinedEntities":
    [{
            "Name": "profile",
            "Selects": [],
            "Join": {
                "Left": {
                    "EntityName": "customer",
                    "JoinedFieldName": "Id"
                },
                "Right": {
                    "EntityName": "profile",
                    "JoinedFieldName": "CustomerId"
                },
                "Type": "Inner"
            }
        }, {
            "Name": "preferences",
            "Selects":
            [{
                    "EntityName": "preferences",
                    "FieldName": "Avatar",
                    "JoinedEntityName": "customer",
                    "JoinedFieldName": "AvatarPicture",
                }
            ]
            "Join": {
                "Left": {
                    "EntityName": "customer",
                    "JoinedFieldName": "Id"
                },
                "Right": {
                    "EntityName": "preferences",
                    "JoinedFieldName": "CustomerId"
                },
                "Type": "Outer"
            }
        }
    ],
    "Wheres":
    [{
            "Condition": {
                "FieldName": "Id",
                "Operator": "EqualTo",
                "Value": "25"
            },
            "Operator": "None"
        }, {
            "Operator": "Or",
            "NestedWheres":
            [{
                    "Condition": {
                        "FieldName": "Id",
                        "Operator": "NotEqualTo",
                        "Value": "25"
                    },
                    "Operator": "None"
                }, {
                    "Condition": {
                        "FieldName": "CreatedDateUc",
                        "Operator": "GreaterThan",
                        "Value": "2020-06-13T12:00:00.000Z"
                    },
                    "Operator": "And"
                }
            ]
        }
    ],
    "ResultOptions": {
        "Skip": 0,
        "Take": 100,
        "OrderBy": {
        	"By": "Id",
        	"Direction" : "Ascending"
        }
    },
    "Options": {
        "IsEmpty": false
    }
}

Notes to Implementers

When implementing a Query or Search method that uses QueryAny.QueryClause as an input to define the query, the following guidance should help decide how to structure the implementation.

The AnyStorageBaseSpec is a test specification to comply with for testing any implementation of IStorage<TEntity>.

Supported Data Types

The following data types will need to be supported both in persistence (hydration/rehydration) and in the query language itself, as properties on an entity IPersistableEntity:

  • null (for any data type)
  • string
  • bool and bool?
  • Guid and Guid?
  • DateTime and DateTime?
  • DateTimeOffset and DateTimeOffset?
  • int and int?
  • long and long?
  • double and double?
  • byte[]
  • Guid and Guid?
  • IPersistableValueObject
  • Any other data type, known as a Complex data type. (IsComplexStorageType defines these types)

Dates and Times are UTC only

Not all storage technologies natively support storing and retrieving time zone information with dates and times. Most of these technologies may assume that only UTC dates and times require storing.

Constraint: Storing time zone information in DateTime is not supported in any repository. However, you can still use DateTimeOffset to store date and times with time zone information.

Thus, all repositories in this library will assume that DateTime values are first converted (by the caller) to UTC before storing.

These repositories will then store the UTC time.

They will then return only UTC DateTime values (Kind == DateTimeKind.Utc), except when the value is DateTime.MinValue this will be returned as Kind == DateTimeKind.Unspecified.

DateTime.MinValue is not meant to represent an real point in time, it is used as a default value only, so its Kind should not signify any meaning to any code consuming it. Thus we are choosing to specify that it hasKind == DateTimeKind.Unspecified to continue to support consuming code.

Query Method

Guidance for implementing a query method of your repository that takes this form: QueryResults<TEntity> Query(QueryClause<TEntity> query);

  1. if a null query is passed to a search function, then it should return empty results (but not null).
  2. query.Options.IsEmpty is a special case that indicates that there cannot be any Where conditions defined, and therefore the query should always return empty results.
  3. query.Wheres.Count == 0 indicates that there are no Where conditions defined, and therefore the query should return all results from the entity container. However, Join rules also apply. (ie. inner, left join rules still need to be honored).
  4. Consider local efficiencies of fetching data from the repository, for that specific repository. For example, with a SQL database, the database itself may be best to operate Joins, filtering, ordering etc. at the database. Whereas, an in memory database, may just join in memory, since there is no latency in fetching the data.
  5. If query.Selects.Count == 0 then populate all properties of the entity from container with their respective values from the store.
  6. If query.Selects.Count > 0 then populate only the selected properties with their respective values from the store (also, do mapping of property values from any joined selects) but do NOT set any other properties to their respective values, leaving them with their default values.
  7. If query.ResultOptions.Order.By == null then assume that ordering is always by CreatedAtUtc and Ascending.
  8. When ordering results, and there are selected fields, then only order results if the Order.By field is included in a combined list of fields from Select and SelectWithJoin statements. Otherwise order results according to the Order options.
  9. If query.SelectFromJoin is used and the joined value from the joined container either does not exist, then the primary container value keeps it original value, and is not overwritten.