-
Notifications
You must be signed in to change notification settings - Fork 1
Design
In the design of QueryAny we aimed to achieve two things:
- A fluent language that developers can use to construct queries across various repository technologies.
- A programming model that repository developers can use to implement their
IStorage<TEntity>
repository implementations (using the QueryAny query language).
We wanted a familiar language to developers that looked and felt a little like LINQ, and a little bit like SQL, but not exactly like that, since this language cannot be either, and because we recognise that there are real constraints to what QueryAny must be generalised to, to be able to access: SQL, no-SQL and other types of repositories.
For example: we can't realistically expect to support all the optimization features of SQL (as a language) since No-SQL repositories would not be able to implement them (as SQL repositories would), so the language is necessarily restricted.
Note: It is worth noting that when we refer to No-SQL repositories, we are specifically excluding No-SQL Graph repositories.
The language of QueryAny was necessarily designed to be a fluent language in C#, to make it familiar to use, and to be highly discoverable for developers. (Unlike more flexible and variable languages like LINQ-to-SQL that often requires the developer to observe its capabilities in multiple examples in order to learn and access all its features). QueryAny will teach you how to use it!
We have found that it is always useful to state the assumptions that inform design decisions, so that consumers of the design can understand its applicability to their specific context, and expose its design, trade-offs, compromises and limitations.
No design is perfect, and we assume that too.
To utilize a generalised repository layer for access by your domain logic has many unquestionable advantages when designing and implementing many software products and services sustainably over time. These advantages are obtained by disciplines such as: separation of concerns, de-coupling, testability, inversion of control, layering, etc.
However, as with any design, there are general trade-offs that have to be made between: performance and maintainability to achieve these advantages. Some purests will be very uncomfortable with this idea, and we accept that not everyone is at the same level of understanding or has the same level of tolerance for this trade-off.
Note: It should be recognized that in general, as designs become more performance-optimized they also attract far more complexity and brittleness, and entering into this complexity should be deferred as long as possible to prevent premature increases in complexity that hinder velocity and progress ina codebase that may need to remain changeable.
We assume that the developer using QueryAny understands these kinds of trade-offs over-time in their software products and services, and is willing to compromise early performance-optimization over having a highly maintainable, flexible and ordered codebase. That they also understand that when it comes to optimizing certain areas of their codebases, in those areas they also compromise maintainability and will introduce more complexity to attain the maximum performance-efficiencies that various technologies may provide.
Note: Highly maintainable/ordered systems, can be performance-optimized in any area. But highly performance-optimized systems generally become far to complex and brittle to make maintainable some time later.
If these design principles are not generally accepted, then developers won't find much value in repository patterns that are achievable with libraries like QueryAny, and they will need to work far harder to keep complexity down in their codebases.
These are the assumptions we have of the various repositories that are accessible by QueryAny:
- We assume that a single repository (eg. database, or store) will contain one or more 'collections' or related entities (eg. tables, buckets, containers, etc).
- We assume that a developer will wish to search for and retrieve a whole entity from a single collection in the repository, OR
- We assume that a developer may want to search for and retrieve an aggregated entity from one or more collections the repository (explicitly or implicitly related in the same repository).
- We do not assume that any repository implements referential integrity
- We do not assume that any repository implements normalization.
- We do not assume that any repository supports in-process processing of entities (like SQL Server does), and this may require many of these capabilities to be done by the client in memory.
- We assume that memory is now a cheap commodity and compute resources have cheap access to very much of it, and that in-memory repositories are fast replacing slow disk bases stores (like SQL Server).
These are the capabilities that we assume developers are comfortable with accepting as necessary limitations of QueryAny:
- Be able to query for a whole single entity from a repository. Requires ->
Entity Naming
|Field Conditions
- Be able to query for a partial single entity from a repository. Requires ->
Entity Naming
|Field Conditions
|Field Selection
- Be able to query for an whole aggregate entity from a repository. Requires ->
Entity Naming
|Field Conditions
|Entity Joining
- Be able to query for an partial aggregate entity from a repository. Requires ->
Entity Naming
|Field Conditions
|Entity Joining
|Field Selection
- Be able to select all or some of the parts of a single/aggregate entity, for efficiency. Requires ->
Field Selection
- Be able to order by only one field in the result set. Requires ->
Ordering
- Be able to limit the number of entities returned, and offset to a 'page' of results in a large result sets. Requires ->
Limiting | Offsetting
Note: There will be some capabilities that are supported by some types of repositories that are definitely not achievable across all types of repositories, and these are out of the scope of what QueryAny can do (or should) do.
- Not navigating No-SQL graph repositories.
- Not supporting all possible data types.
- Not supporting deep querying into documents in document databases. Only querying top level fields of document.
- Not optimizing for certain features of SQL:
- OUTER and RIGHT joins
- Sub select statements
- Views, Stored Procedures, Functions, etc.
See the Language Reference
The query language defines a structure of the query to implementers of the IStorage<TEntity>
repository interface. This is the model used to drive the specific repository technology implementation.
For example: let's say that you wanted to write an implementation of IStorage<TEntity>
for a JSON file system. You may decide to store all data in one JSON file, or you may decide to store a container of data in a different file (by the name of that container). You may decide to store the data as NV pairs or as a whole document. The design choice is entirely yours.
But to do any of this you will need an object model in code that describes the developers' query, to help you navigate your repository and extract the right query results.
The QueryClause
defines the entire query and is the root of the model.
Each query will include the information about all the Entities
involved (the primary entity and any joined entities), and will include all the data about each of the Where
conditions, and any other supported features.
In essence, you would navigate into a specific entity to discover information about its specific Join
, and its Selects
.
Example: a fully fledged QueryClause
like this:
var query = Query.From<CustomerEntity>()
.Join<ProfileEntity, string>(customer => customer.Id, profile => profile.CustomerId)
.AndJoin<PreferencesEntity, string>(customer => customer.Id, preferences => preferences.CustomerId, JoinType.Outer)
.Where(customer => customer.Id, ConditionOperator.EqualTo, "25")
.OrWhere(subQuery =>
subQuery.Where(customer => customer.Id, ConditionOperator.NotEqualTo, "25")
.AndWhere(customer => customer.CreatedDateUc, ConditionOperator.GreaterThan, DateTime.UtcNow))
.Select(customer => customer.Id)
.Select(customer => customer.Name)
.SelectFromJoin<PreferencesEntity>(customer => customer.AvatarPicture, preferences => preferences.Avatar)
.Take(100).Skip(0)
.OrderBy(customer => customer.Id)
Would have a JSON model like this:
{
"PrimaryEntity": {
"Name": "customer",
"Selects":
[{
"EntityName": "customer",
"FieldName": "Id",
}, {
"EntityName": "customer",
"FieldName": "Name",
}
]
},
"JoinedEntities":
[{
"Name": "profile",
"Selects": [],
"Join": {
"Left": {
"EntityName": "customer",
"JoinedFieldName": "Id"
},
"Right": {
"EntityName": "profile",
"JoinedFieldName": "CustomerId"
},
"Type": "Inner"
}
}, {
"Name": "preferences",
"Selects":
[{
"EntityName": "preferences",
"FieldName": "Avatar",
"JoinedEntityName": "customer",
"JoinedFieldName": "AvatarPicture",
}
]
"Join": {
"Left": {
"EntityName": "customer",
"JoinedFieldName": "Id"
},
"Right": {
"EntityName": "preferences",
"JoinedFieldName": "CustomerId"
},
"Type": "Outer"
}
}
],
"Wheres":
[{
"Condition": {
"FieldName": "Id",
"Operator": "EqualTo",
"Value": "25"
},
"Operator": "None"
}, {
"Operator": "Or",
"NestedWheres":
[{
"Condition": {
"FieldName": "Id",
"Operator": "NotEqualTo",
"Value": "25"
},
"Operator": "None"
}, {
"Condition": {
"FieldName": "CreatedDateUc",
"Operator": "GreaterThan",
"Value": "2020-06-13T12:00:00.000Z"
},
"Operator": "And"
}
]
}
],
"ResultOptions": {
"Skip": 0,
"Take": 100,
"OrderBy": {
"By": "Id",
"Direction" : "Ascending"
}
},
"Options": {
"IsEmpty": false
}
}
When implementing a Query or Search method that uses QueryAny.QueryClause
as an input to define the query, the following guidance should help decide how to structure the implementation.
The AnyStorageBaseSpec is a test specification to comply with for testing any implementation of
IStorage<TEntity>
.
The following data types will need to be supported both in persistence (hydration/rehydration) and in the query language itself, as properties on an entity IPersistableEntity
:
-
null
(for any data type) string
-
bool
andbool?
-
Guid
andGuid?
-
DateTime
andDateTime?
-
DateTimeOffset
andDateTimeOffset?
-
int
andint?
-
long
andlong?
-
double
anddouble?
byte[]
-
Guid
andGuid?
IPersistableValueObject
- Any other data type, known as a Complex data type. (
IsComplexStorageType
defines these types)
Not all storage technologies natively support storing and retrieving time zone information with dates and times. Most of these technologies may assume that only UTC dates and times require storing.
Constraint: Storing time zone information in
DateTime
is not supported in any repository. However, you can still useDateTimeOffset
to store date and times with time zone information.
Thus, all repositories in this library will assume that DateTime
values are first converted (by the caller) to UTC before storing.
These repositories will then store the UTC time.
They will then return only UTC DateTime
values (Kind == DateTimeKind.Utc
), except when the value is DateTime.MinValue
this will be returned as Kind == DateTimeKind.Unspecified
.
DateTime.MinValue
is not meant to represent an real point in time, it is used as a default value only, so itsKind
should not signify any meaning to any code consuming it. Thus we are choosing to specify that it hasKind == DateTimeKind.Unspecified
to continue to support consuming code.
Guidance for implementing a query method of your repository that takes this form: QueryResults<TEntity> Query(QueryClause<TEntity> query);
- if a null query is passed to a search function, then it should return empty results (but not null).
-
query.Options.IsEmpty
is a special case that indicates that there cannot be anyWhere
conditions defined, and therefore the query should always return empty results. -
query.Wheres.Count == 0
indicates that there are noWhere
conditions defined, and therefore the query should return all results from the entity container. However,Join
rules also apply. (ie. inner, left join rules still need to be honored). - Consider local efficiencies of fetching data from the repository, for that specific repository. For example, with a SQL database, the database itself may be best to operate Joins, filtering, ordering etc. at the database. Whereas, an in memory database, may just join in memory, since there is no latency in fetching the data.
- If
query.Selects.Count == 0
then populate all properties of the entity from container with their respective values from the store. - If
query.Selects.Count > 0
then populate only the selected properties with their respective values from the store (also, do mapping of property values from any joined selects) but do NOT set any other properties to their respective values, leaving them with their default values. - If
query.ResultOptions.Order.By == null
then assume that ordering is always byCreatedAtUtc
andAscending
. - When ordering results, and there are selected fields, then only order results if the
Order.By
field is included in a combined list of fields fromSelect
andSelectWithJoin
statements. Otherwise order results according to theOrder
options. - If
query.SelectFromJoin
is used and the joined value from the joined container either does not exist, then the primary container value keeps it original value, and is not overwritten.