Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Presto failed when MongoDB has a dollar sign as their field name #3379

Closed
chhsiao90 opened this issue Apr 8, 2020 · 6 comments
Closed

Presto failed when MongoDB has a dollar sign as their field name #3379

chhsiao90 opened this issue Apr 8, 2020 · 6 comments
Labels
bug Something isn't working

Comments

@chhsiao90
Copy link
Member

According to Document - Field Names

Top-level field names cannot start with the dollar sign ($) character.
Otherwise, starting in MongoDB 3.6, the server permits storage of field names that contain dots (i.e. .) and dollar signs (i.e. $).

It's valid to have dollar signs in their field name.

For a given document:

{
	"_id" : ObjectId("552786262cec76ed95fd61cc"),
	"client" : "0.0.0.0",
	"keyUpdates" : 0,
	"lockStats" : {
		"timeAcquiringMicros" : {
			"r" : 2,
			"w" : 120711
		},
		"timeLockedMicros" : {
			"r" : 58,
			"w" : 0
		}
	},
	"millis" : 0,
	"nreturned" : 0,
	"ns" : "school2.system.indexes",
	"nscanned" : 3,
	"ntoreturn" : 0,
	"ntoskip" : 0,
	"numYield" : 0,
	"op" : "query",
	"query" : {
		"expireAfterSeconds" : {
			"$exists" : true
		}
	},
	"responseLength" : 20,
	"ts" : ISODate("2012-11-20T20:02:33.060Z"),
	"user" : ""
}

Presto generate schema

{
	"_id" : ObjectId("5e8d29acf5ddd83a5184c3a9"),
	"table" : "profiles",
	"fields" : [
		{
			"name" : "_id",
			"type" : "ObjectId",
			"hidden" : true
		},
		{
			"name" : "client",
			"type" : "varchar",
			"hidden" : false
		},
		{
			"name" : "keyUpdates",
			"type" : "bigint",
			"hidden" : false
		},
		{
			"name" : "lockStats",
			"type" : "row(timeAcquiringMicros row(r bigint,w bigint),timeLockedMicros row(r bigint,w bigint))",
			"hidden" : false
		},
		{
			"name" : "millis",
			"type" : "bigint",
			"hidden" : false
		},
		{
			"name" : "nreturned",
			"type" : "bigint",
			"hidden" : false
		},
		{
			"name" : "ns",
			"type" : "varchar",
			"hidden" : false
		},
		{
			"name" : "nscanned",
			"type" : "bigint",
			"hidden" : false
		},
		{
			"name" : "ntoreturn",
			"type" : "bigint",
			"hidden" : false
		},
		{
			"name" : "ntoskip",
			"type" : "bigint",
			"hidden" : false
		},
		{
			"name" : "numYield",
			"type" : "bigint",
			"hidden" : false
		},
		{
			"name" : "op",
			"type" : "varchar",
			"hidden" : false
		},
		{
			"name" : "query",
			"type" : "row(expireAfterSeconds row($exists boolean))",
			"hidden" : false
		},
		{
			"name" : "responseLength",
			"type" : "bigint",
			"hidden" : false
		},
		{
			"name" : "ts",
			"type" : "timestamp",
			"hidden" : false
		},
		{
			"name" : "user",
			"type" : "varchar",
			"hidden" : false
		}
	]
}

And presto throw exception as dollar sign is not a valid character for field name.

2020-04-08T09:32:28.113+0800	INFO	dispatcher-query-1	stderr	com.google.common.util.concurrent.UncheckedExecutionException: io.prestosql.sql.parser.ParsingException: line 1:28: mismatched input '$'. Expecting: <identifier>, <type>
2020-04-08T09:32:28.113+0800	INFO	dispatcher-query-1	stderr		at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050)
2020-04-08T09:32:28.113+0800	INFO	dispatcher-query-1	stderr		at com.google.common.cache.LocalCache.get(LocalCache.java:3952)
2020-04-08T09:32:28.113+0800	INFO	dispatcher-query-1	stderr		at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974)
2020-04-08T09:32:28.113+0800	INFO	dispatcher-query-1	stderr		at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958)
2020-04-08T09:32:28.113+0800	INFO	dispatcher-query-1	stderr		at com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964)
2020-04-08T09:32:28.113+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.plugin.mongodb.MongoSession.getTable(MongoSession.java:154)
2020-04-08T09:32:28.113+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.plugin.mongodb.MongoMetadata.getTableHandle(MongoMetadata.java:80)
2020-04-08T09:32:28.113+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.plugin.mongodb.MongoMetadata.getTableHandle(MongoMetadata.java:55)
2020-04-08T09:32:28.113+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.metadata.MetadataManager.getTableHandle(MetadataManager.java:281)
2020-04-08T09:32:28.113+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.sql.rewrite.ShowQueriesRewrite$Visitor.visitShowColumns(ShowQueriesRewrite.java:365)
2020-04-08T09:32:28.113+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.sql.rewrite.ShowQueriesRewrite$Visitor.visitShowColumns(ShowQueriesRewrite.java:154)
2020-04-08T09:32:28.113+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.sql.tree.ShowColumns.accept(ShowColumns.java:54)
2020-04-08T09:32:28.113+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.sql.tree.AstVisitor.process(AstVisitor.java:27)
2020-04-08T09:32:28.113+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.sql.rewrite.ShowQueriesRewrite.rewrite(ShowQueriesRewrite.java:151)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.sql.rewrite.StatementRewrite.rewrite(StatementRewrite.java:57)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.sql.analyzer.Analyzer.analyze(Analyzer.java:79)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.sql.analyzer.Analyzer.analyze(Analyzer.java:74)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.execution.SqlQueryExecution.analyze(SqlQueryExecution.java:218)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.execution.SqlQueryExecution.<init>(SqlQueryExecution.java:177)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.execution.SqlQueryExecution.<init>(SqlQueryExecution.java:95)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.execution.SqlQueryExecution$SqlQueryExecutionFactory.createQueryExecution(SqlQueryExecution.java:719)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.dispatcher.LocalDispatchQueryFactory.lambda$createDispatchQuery$0(LocalDispatchQueryFactory.java:120)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at java.lang.Thread.run(Thread.java:748)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr	Caused by: io.prestosql.sql.parser.ParsingException: line 1:28: mismatched input '$'. Expecting: <identifier>, <type>
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.sql.parser.ErrorHandler.syntaxError(ErrorHandler.java:107)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at org.antlr.v4.runtime.ProxyErrorListener.syntaxError(ProxyErrorListener.java:41)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at org.antlr.v4.runtime.Parser.notifyErrorListeners(Parser.java:544)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at org.antlr.v4.runtime.DefaultErrorStrategy.reportNoViableAlternative(DefaultErrorStrategy.java:310)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at org.antlr.v4.runtime.DefaultErrorStrategy.reportError(DefaultErrorStrategy.java:136)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.sql.parser.SqlBaseParser.type(SqlBaseParser.java:10132)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.sql.parser.SqlBaseParser.rowField(SqlBaseParser.java:10184)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.sql.parser.SqlBaseParser.type(SqlBaseParser.java:9830)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.sql.parser.SqlBaseParser.standaloneType(SqlBaseParser.java:385)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.sql.parser.SqlParser.invokeParser(SqlParser.java:167)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.sql.parser.SqlParser.createType(SqlParser.java:117)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.metadata.TypeRegistry.fromSqlType(TypeRegistry.java:171)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.metadata.MetadataManager.fromSqlType(MetadataManager.java:1218)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.type.InternalTypeManager.fromSqlType(InternalTypeManager.java:58)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.plugin.mongodb.MongoSession.buildColumnHandle(MongoSession.java:198)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.plugin.mongodb.MongoSession.loadTableSchema(MongoSession.java:184)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at com.google.common.cache.CacheLoader$FunctionToCacheLoader.load(CacheLoader.java:165)
2020-04-08T09:32:28.114+0800	INFO	dispatcher-query-1	stderr		at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528)
2020-04-08T09:32:28.115+0800	INFO	dispatcher-query-1	stderr		at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277)
2020-04-08T09:32:28.115+0800	INFO	dispatcher-query-1	stderr		at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154)
2020-04-08T09:32:28.115+0800	INFO	dispatcher-query-1	stderr		at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044)
2020-04-08T09:32:28.115+0800	INFO	dispatcher-query-1	stderr		... 27 more
2020-04-08T09:32:28.115+0800	INFO	dispatcher-query-1	stderr	Caused by: org.antlr.v4.runtime.NoViableAltException
2020-04-08T09:32:28.115+0800	INFO	dispatcher-query-1	stderr		at org.antlr.v4.runtime.atn.ParserATNSimulator.noViableAlt(ParserATNSimulator.java:2028)
2020-04-08T09:32:28.115+0800	INFO	dispatcher-query-1	stderr		at org.antlr.v4.runtime.atn.ParserATNSimulator.execATN(ParserATNSimulator.java:467)
2020-04-08T09:32:28.115+0800	INFO	dispatcher-query-1	stderr		at org.antlr.v4.runtime.atn.ParserATNSimulator.adaptivePredict(ParserATNSimulator.java:393)
2020-04-08T09:32:28.115+0800	INFO	dispatcher-query-1	stderr		at io.prestosql.sql.parser.SqlBaseParser.type(SqlBaseParser.java:9818)
2020-04-08T09:32:28.115+0800	INFO	dispatcher-query-1	stderr		... 42 more
@chhsiao90
Copy link
Member Author

I can create a pull request if it's ok to just bypass field which starts with a dollar sign.

diff --git a/presto-mongodb/src/main/java/io/prestosql/plugin/mongodb/MongoSession.java b/presto-mongodb/src/main/java/io/prestosql/plugin/mongodb/MongoSession.java
index ea56a9b060..d1c264ef86 100644
--- a/presto-mongodb/src/main/java/io/prestosql/plugin/mongodb/MongoSession.java
+++ b/presto-mongodb/src/main/java/io/prestosql/plugin/mongodb/MongoSession.java
@@ -546,6 +546,10 @@ public class MongoSession
             List<TypeSignatureParameter> parameters = new ArrayList<>();
 
             for (String key : ((Document) value).keySet()) {
+                if (key.startsWith("$")) {
+                    return Optional.empty();
+                }
+
                 Optional<TypeSignature> fieldType = guessFieldType(((Document) value).get(key));
                 if (!fieldType.isPresent()) {
                     return Optional.empty();

@ebyhr ebyhr added the bug Something isn't working label Apr 8, 2020
@ebyhr
Copy link
Member

ebyhr commented Apr 8, 2020

Thanks for your investigation. Did you try quoting the field in _schema collection? If just quotation resolves it, it's better than bypassing.

@chhsiao90
Copy link
Member Author

I hadn't tried quoting the field in the _schema collection. I will try it, thanks for the advise.

@chhsiao90
Copy link
Member Author

chhsiao90 commented Apr 8, 2020

I tired quoting the field in the _schema, I can get the metadata of the table (describe table or information_schema), but I'm still not able to query the table.

The quoted field is de-quoted after typeManager.fromSqlType(typeString) was called.
row("$dollar" type) -> row($dollar type).
So same exception will be thrown during the REST api called like "updating task" or "getting task summary".

2020-04-08T11:24:57.531+0800	ERROR	remote-task-callback-1	io.prestosql.execution.StageStateMachine	Stage 20200408_032009_00000_a6spa.1 failed
io.prestosql.spi.PrestoException: Unexpected response from http://192.168.1.106:8000/v1/task/20200408_032009_00000_a6spa.1.0?summarize
	at io.prestosql.server.remotetask.SimpleHttpResponseHandler.onSuccess(SimpleHttpResponseHandler.java:71)
	at io.prestosql.server.remotetask.SimpleHttpResponseHandler.onSuccess(SimpleHttpResponseHandler.java:27)
	at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1058)
	at io.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:78)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: Unable to create class io.prestosql.execution.TaskInfo from JSON response:
[io.airlift.jaxrs.JsonMapperParsingException: Invalid json for Java type io.prestosql.server.TaskUpdateRequest
	at io.airlift.jaxrs.JsonMapper.readFrom(JsonMapper.java:156)

BTW, it looks like Mongo server supports the dollar sign (since 3.6.0), but the drivers are still not supported now. see SERVER-30575. (So I can't create a minimum reproducible unit test as java driver is not supported).

@ebyhr
Copy link
Member

ebyhr commented Apr 8, 2020

I can select the table as below. Could you share the latest _schema collection?

presto> select * from mongodb.tpch.test2;
             c1
----------------------------
 {x=1, field1={.y=1, $z=1}}

@chhsiao90
Copy link
Member Author

It looks like it's working now, maybe I'm doing something wrong. I will check if I can create a unit test for this issue and create a pull request.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Development

Successfully merging a pull request may close this issue.

2 participants