Nested arrays of objects are missing #17436

Closed
lianetoohey opened this issue Oct 23, 2018 — with docs.microsoft.com · 19 comments

Comments

While I can successfully connect to one of my CosmosDB collections, another is not working correctly. The Document in question includes a request object. Within request there should be multiple fields. When I sample my collection in Schema Editor, the resulting schema is missing any array of objects (or anything that includes an array of objects) that should be included under the request object. Behavior does not change if the same collection is re-sampled.


Document Details

Do not edit this section. It is required for docs.microsoft.com ➟ GitHub issue linking.

@Mike-Ubezzi-MSFT
Contributor

@lianetoohey Thank you very much for the detailed feedback. Can you provide the following details:

  • Which Cosmos DB service API are you consuming (SQL, MongoDB, Gremlin, etc.)?
  • Can you provide an example request object so we can see how write queries are being done?

With this we can actively investigate.
Regards,
Mike

@lianetoohey
Author

@Mike-Ubezzi-MSFT we are using SQL API. I have attached a sample document from the collection which includes the request object. Please let me know if any further information would be helpful. Many thanks!
example.txt

@Mike-Ubezzi-MSFT
Contributor

And what is the query you are attempting to use to return the data set? Is the data partitioned, or spread across multiple collections?
Azure Cosmos DB SQL syntax reference

@lianetoohey
Author

We're not using a query in the ODBC connector; we're sampling the entire collection, and it's not detecting all of our fields. Would it help if I shared the schema file? The data is not partitioned.

@Mike-Ubezzi-MSFT
Contributor

Mike-Ubezzi-MSFT commented Oct 24, 2018

I posted a response to the wrong thread. Please reach out to askcosmosdb at Microsoft dotcom. Please include all the details asked for here plus all your schema details. They will be able to provide feedback. Thank you!

@lianetoohey
Author

I'm happy to provide that information, though I'm not sure I understand the relationship between bulk import and ODBC/Schema Editor so I'm not quite sure how to answer. We're doing this entirely through wizards if that makes a difference; this isn't in our application anywhere.

If we were customizing those values, would they be in the connection string? I don't think we're specifying them (maybe max retries?) but for good measure this is what we've got:

Driver={Microsoft DocumentDB ODBC Driver};AuthenticationKey={redacted};Consistency={1};DESCRIPTION={};Driver={Microsoft DocumentDB ODBC Driver};DSNType={1};Host={https://oursite.documents.azure.com:443/};KeyEncrypted={true};LocalSchemaFile={C:\TA Hub\TAHub Schema.json};LogFileCount={50};LogFileSize={20};NumberOfRetries={5}

Let me know if this isn't what you were referring to.

Thanks,
Liane
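
For readers who want to check which settings a connection string like the one above actually carries, here is a minimal Python sketch that splits it into key/value pairs. The function name `parse_odbc_connection_string` is hypothetical, and the parsing rule (semicolon-separated `Key={value}` pairs) is an assumption based only on the string posted in this thread, not on the driver's documentation. Keys that appear more than once, such as `Driver`, are collected into lists.

```python
import re
from collections import defaultdict

# Assumption: the string is a sequence of Key={value} pairs separated by
# semicolons, as in the example posted in this thread.
PAIR = re.compile(r"\s*([^=;{}]+?)\s*=\s*\{([^}]*)\}\s*(?:;|$)")


def parse_odbc_connection_string(text):
    """Return a dict mapping each key to the list of values it appears with."""
    settings = defaultdict(list)
    for key, value in PAIR.findall(text):
        settings[key].append(value)
    return dict(settings)


# The connection string from the comment above, split across lines for width.
example = (
    "Driver={Microsoft DocumentDB ODBC Driver};AuthenticationKey={redacted};"
    "Consistency={1};DESCRIPTION={};Driver={Microsoft DocumentDB ODBC Driver};"
    "DSNType={1};Host={https://oursite.documents.azure.com:443/};"
    "KeyEncrypted={true};LocalSchemaFile={C:\\TA Hub\\TAHub Schema.json};"
    "LogFileCount={50};LogFileSize={20};NumberOfRetries={5}"
)

settings = parse_odbc_connection_string(example)
for key, values in settings.items():
    print(key, "=", values)
```

Listing the keys this way makes it easy to see which options are set explicitly and which are left at whatever the driver defaults to.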

@Mike-Ubezzi-MSFT
Contributor

@lianetoohey I was working a 'bulk import' issue. I mistakenly posted my response to this issue instead of the issue it was intended for.
As for your specific issue: the JSON request object almost looks like data intended for a partitioned environment, whereas JSON with an array looks slightly different. I couldn't find a good example. Since you have one working environment and one that is not working as expected, sending each instance name, along with the details you have shared here, to the address listed could be very helpful to the Cosmos DB team.

@lianetoohey
Author

@Mike-Ubezzi-MSFT sorry, I don't know what you're looking for. Can you please clarify?

@lianetoohey
Author

Ah please disregard, I see that I missed another message from you in this thread. I'll reach out to the email address you provided. Thanks! - Liane

@Mike-Ubezzi-MSFT
Contributor

@lianetoohey We will now proceed to close this thread. If there are further questions regarding this matter, please comment and we will gladly continue the discussion.

@ashutoshsharma15

@Mike-Ubezzi-MSFT @lianetoohey I am facing the same issue: sampling does not detect fields that contain a nested array of key-value pairs in the JSON document. Is there a resolution to this issue?

@lianetoohey
Author

@ashutoshsharma15 looks like a bug that Microsoft is working on. According to @balaksms they are finalizing some kind of fix as of yesterday.

@ashutoshsharma15

Thanks @lianetoohey. How can I track the status of the bug to keep myself updated? cc @Mike-Ubezzi-MSFT @balaksms

@awickham10

We also seem to be facing this issue. Has it been resolved?

@lianetoohey
Author

@awickham10 not as far as I'm aware

@kgmccann

kgmccann commented Apr 3, 2019

Anyone figure out a way to work around this issue? I've tried to edit the JSON the Schema Editor spits out, but it is not obvious how (or not possible) to format for tables-nested-in-tables and I keep getting errors.

@lianetoohey
Author

@kgmccann I tried the same thing but never got it to work. I wanted this for reporting so I'm going to try to feed my data source through some other connector, but I haven't figured out the details yet. Last I heard Microsoft was targeting a Q3 2019 fix for this.

Copy link

I am also facing the same issue. Is it fixed in the ODBC driver, or is there a workaround for arrays and nested arrays?
For example, in ADF it can be queried in the source of a Copy Data activity as
"select na.field_name, c.* from collection_name c join na in nested_array"

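To make the shape of that query's output concrete, the following Python sketch imitates what a `JOIN ... IN` over a nested array does: it emits one row per array element, paired with its parent document. It runs on in-memory dictionaries rather than against Cosmos DB; the sample documents, the `unnest` helper, and the field names are all hypothetical. In the Cosmos DB SQL API the array is normally referenced through the collection alias (for example `JOIN na IN c.nested_array`), so the query quoted above may need that adjustment.

```python
def unnest(documents, array_field):
    """Yield (parent, element) pairs, one per element of parent[array_field].

    Parents whose array is missing or empty yield nothing, which mirrors
    the inner-join behaviour of JOIN ... IN.
    """
    for parent in documents:
        for element in parent.get(array_field) or []:
            yield parent, element


# Hypothetical documents; the real schema in this thread was not posted inline.
documents = [
    {"id": "1", "nested_array": [{"field_name": "a"}, {"field_name": "b"}]},
    {"id": "2", "nested_array": []},
    {"id": "3"},
]

# One output row per array element, carrying the parent's id alongside it.
rows = [
    {"id": parent["id"], "field_name": element.get("field_name")}
    for parent, element in unnest(documents, "nested_array")
]
print(rows)
```

Only document "1" contributes rows here, because the other two have no array elements to join against.
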
@kgmccann

@AzureSan I ended up writing a custom script, eschewing the ODBC driver entirely. Because the destination for me was Power BI, I could embed an R script to get the job done as described in the comments here. In retrospect it would probably have been easier to write it in Python, since there is an SDK for that.
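
A rough idea of what such a script has to do, sketched in Python: once the documents have been retrieved (retrieval is omitted here and could go through any client), nested objects are flattened into dotted column names and each array of objects is expanded into one row per element. This is an illustrative sketch under those assumptions, not the script described in the comment above; the `flatten` helper and the sample document are hypothetical.

```python
from itertools import product


def flatten(value, prefix=""):
    """Return a list of flat row dicts for one JSON-like value.

    Nested objects contribute dotted column names; each list is expanded
    into one row per element, and sibling lists are combined as a cross
    product.
    """
    if isinstance(value, dict):
        # Flatten every field, then combine the alternatives of all fields.
        per_field = [
            flatten(child, f"{prefix}.{key}" if prefix else key)
            for key, child in value.items()
        ]
        rows = []
        for combo in product(*per_field):
            row = {}
            for part in combo:
                row.update(part)
            rows.append(row)
        return rows
    if isinstance(value, list):
        if not value:
            return [{}]  # keep the parent row even when the array is empty
        return [row for item in value for row in flatten(item, prefix)]
    return [{prefix: value}]


# Hypothetical document with an array of objects nested under "request".
document = {
    "id": "1",
    "request": {
        "user": "u1",
        "items": [{"sku": "A", "qty": 1}, {"sku": "B", "qty": 2}],
    },
}

rows = flatten(document)
for row in rows:
    print(row)
```

The resulting rows are plain dictionaries, so they can be handed to `csv.DictWriter` or loaded into a data frame for reporting.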
