Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Datetime fields - ISO vs non-ISO - How does the behaviour of Cosmos DB change? #52598

Closed
sdg002 opened this issue Apr 16, 2020 · 9 comments
Closed

Comments

@sdg002
Copy link

sdg002 commented Apr 16, 2020

Consider the following JSON document :


{
        "id": "09152014101",
        "OrderDate": "2014-09-15T23:14:25.7251173Z",
        "ShipDate": "2014-09-30T23:14:25.7251173Z",
        "SomeDate": "2014-09-30T23:14:25",
        "Total": 113.39
}

  • The fields OrderDate and ShipDate follow ISO format.

  • But the field SomeDate does not follow ISO format.

  • Does Cosmos DB index the fields SomeDate differently from ShipDate and OrderDate?

How does Cosmos DB differ when it comes to range querying in the following scenarios?

  • I could do a range query on the ShipDate using a SQL clause like ShipDate > "2014-09-30T23:14:25Z"
  • I could do a range query on the SomeDate using a SQL clause like SomeDate > 2014-09-30T00:00:00

It looks to me that storing date time fields as ISO strings is more driven by convention.


Document Details

Do not edit this section. It is required for docs.microsoft.com ➟ GitHub issue linking.

@sdg002
Copy link
Author

sdg002 commented Apr 16, 2020

I am unable to edit the title. I hit the Save button too quickly. Please, could the moderator update the title
Desired title
Datetime fields - ISO vs non-ISO - How does the behaviour of Cosmos DB change?

Thank you

@Mike-Ubezzi-MSFT Mike-Ubezzi-MSFT changed the title datetime Datetime fields - ISO vs non-ISO - How does the behaviour of Cosmos DB change? Apr 16, 2020
@Mike-Ubezzi-MSFT Mike-Ubezzi-MSFT self-assigned this Apr 16, 2020
@Mike-Ubezzi-MSFT
Copy link
Contributor

@sdg002 Thank you for the inquiry, and just to clarify your question you indicated the following:

  • The fields OrderDate and ShipDate follow ISO format.

  • But the field ShipDate does not follow ISO format.

  • Does Cosmos DB index the fields SomeDate differently from ShipDate and OrderDate?

The second item should read SomeDate, as you have detailed in the third list item. I was going to fix this but didn't want do so without clarifying.

As for your specific question, if you look at the Storing DateTimes section of the document it states the following:

Azure Cosmos DB supports JSON types such as - string, number, boolean, null, array, object. It does not directly support a DateTime type. Currently, Azure Cosmos DB doesn't support localization of dates. So, you need to store DateTimes as strings. The recommended format for DateTime strings in Azure Cosmos DB is YYYY-MM-DDThh:mm:ss.fffffffZ which follows the ISO 8601 UTC standard.

When you query within Cosmos DB only, it is treated as the same unless you attempted to filter on hours, minutes, or seconds. If you were to export the data to ANSI compliant SQL database and attempted to load those values into a DateTime column, the SomeDate value may not load correctly. Another instance is if you were to load this record into PowerBI, the SomeDate value may not be interpreted correctly and covert from UTC to a localized date/time value.

I believe this should answer your question. Please let me know if this does not answer your question or you need more information.

@sdg002
Copy link
Author

sdg002 commented Apr 16, 2020

Thanks for the quick response. Could you explain with an example -

When you query within Cosmos DB only, it is treated as the same unless you attempted to filter on hours, minutes, or seconds

Could you explain with an example what feature of Cosmos DB makes it possible to accomplish this scenario ?
E.g. Filtering on all records which have month = 12

@Mike-Ubezzi-MSFT
Copy link
Contributor

@sdg002 I took your example record to make 4 records, each with unique Date values per record (not per property) and I am able to filter on a specific date with the following:

SELECT * FROM c WHERE STARTSWITH(c.SomeDate, "2014-12")

Example_Records.txt

@Mike-Ubezzi-MSFT
Copy link
Contributor

If I run the following query on the Example_Records.txt dataset:

SELECT * FROM container c WHERE c.SomeDate > '2014-10-30T23:14:25'

I get the following record:

{
    "id": "09152014103",
    "OrderDate": "2014-12-15T23:14:25.7251173Z",
    "ShipDate": "2014-12-30T23:14:25.7251173Z",
    "SomeDate": "2014-12-30T23:14:25",
    "Total": 100,
    "_rid": "TsppAKzUV-8DAAAAAAAAAA==",
    "_self": "dbs/TsppAA==/colls/TsppAKzUV-8=/docs/TsppAKzUV-8DAAAAAAAAAA==/",
    "_etag": "\"00000000-0000-0000-14fc-aeb47f3f01d6\"",
    "_attachments": "attachments/",
    "_ts": 1587157854
}

@Mike-Ubezzi-MSFT
Copy link
Contributor

If I run the same query but with Less Than + Equals, I get the other three records minus the 4th dated in December:

SELECT * FROM container c WHERE c.SomeDate <= '2014-10-30T23:14:25'

{
    "id": "09152014101",
    "OrderDate": "2014-09-15T23:14:25.7251173Z",
    "ShipDate": "2014-09-30T23:14:25.7251173Z",
    "SomeDate": "2014-09-30T23:14:25",
    "Total": 113.39,
    "_rid": "TsppAKzUV-8BAAAAAAAAAA==",
    "_self": "dbs/TsppAA==/colls/TsppAKzUV-8=/docs/TsppAKzUV-8BAAAAAAAAAA==/",
    "_etag": "\"00000000-0000-0000-14fc-94d58b3401d6\"",
    "_attachments": "attachments/",
    "_ts": 1587157811
},
{
    "id": "09152014102",
    "OrderDate": "2014-10-15T23:14:25.7251173Z",
    "ShipDate": "2014-10-30T23:14:25.7251173Z",
    "SomeDate": "2014-10-30T23:14:25",
    "Total": 110,
    "_rid": "TsppAKzUV-8CAAAAAAAAAA==",
    "_self": "dbs/TsppAA==/colls/TsppAKzUV-8=/docs/TsppAKzUV-8CAAAAAAAAAA==/",
    "_etag": "\"00000000-0000-0000-14fc-a33e42a901d6\"",
    "_attachments": "attachments/",
    "_ts": 1587157835
},
{
    "id": "09152014104",
    "OrderDate": "2014-02-15T23:14:25.7251173Z",
    "ShipDate": "2014-02-30T23:14:25.7251173Z",
    "SomeDate": "2014-02-30T23:14:25",
    "Total": 50.39,
    "_rid": "TsppAKzUV-8EAAAAAAAAAA==",
    "_self": "dbs/TsppAA==/colls/TsppAKzUV-8=/docs/TsppAKzUV-8EAAAAAAAAAA==/",
    "_etag": "\"00000000-0000-0000-14fc-b940d03401d6\"",
    "_attachments": "attachments/",
    "_ts": 1587157872
}

@Mike-Ubezzi-MSFT
Copy link
Contributor

@sdg002 We will now proceed to close this thread. If there are further questions regarding this matter, please comment and we will gladly continue the discussion.

@sdg002
Copy link
Author

sdg002 commented Apr 21, 2020

@Mike-Ubezzi-MSFT
Thank you for sample JSON. I got the opportunity to load this into my Cosmos instance and have a play.

The original intent of my question was to find out whether the engine of Cosmos DB internally handles ISO formatted strings in a special way and not just as another string field.

Inspired by your sample, I extended your JSON by adding a new field AnotherDate as follows: This field stores the data format using a yyyy\MM\dd format.

[
    {
        "id": "09152014101",
        "SomeDate": "2014-09-30T23:14:25",
        "AnotherDate": "2014/09/30"
    },
    {
        "id": "09152014102",
        "SomeDate": "2014-10-30T23:14:25",
        "AnotherDate": "2014/10/30"
    },
    {
        "id": "09152014104",
        "SomeDate": "2014-02-30T23:14:25",
        "AnotherDate": "2014/02/30"
    },
    {
        "id": "09152014103",
        "SomeDate": "2014-12-30T23:14:25",
        "AnotherDate": "2014/12/30"
    }
]

When I run the following queries
SELECT c.id, c.SomeDate, c.AnotherDate FROM c where c.AnotherDate <= '2014/10/31'

and

SELECT c.id, c.SomeDate, c.AnotherDate FROM c where c.SomeDate <= '2014-10-31T23:14:25'

I get identical results

[
    {
        "id": "09152014101",
        "SomeDate": "2014-09-30T23:14:25",
        "AnotherDate": "2014/09/30"
    },
    {
        "id": "09152014102",
        "SomeDate": "2014-10-30T23:14:25",
        "AnotherDate": "2014/10/30"
    },
    {
        "id": "09152014104",
        "SomeDate": "2014-02-30T23:14:25",
        "AnotherDate": "2014/02/30"
    }
]

On the surface, I see no difference between an ISO date and the date string yyyy/MM/dd. In bothe cases Cosmos carried out a lexical string comparison. ISO has no part to play here.

I hope I was able to put my points clearly.

image
https://docs.microsoft.com/en-us/azure/cosmos-db/working-with-dates

  • Contrary to the documentation I was able to write range queries using non-ISO date.
  • Does Cosmos DB engine identify ISO strings and index them as a datetime?

thanks for your patience.

Copy link
Contributor

Hi @sdg002, your analysis above is correct! The reason we recommend using the ISO 8601 format is because this is the format that GetCurrentDateTime uses in Cosmos DB: https://docs.microsoft.com/en-us/azure/cosmos-db/sql-query-getcurrentdatetime

If you wanted to use a different date format, that would be fine. The field is indexed in the same way (a string) regardless of the date format you choose.
Filters and comparisons in your queries compare the string values so that's why we recommend that all dates are in UTC and have a consistent length.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants