DatetimeParseException for date expression #1

Closed
O1lpunch3r opened this issue Dec 14, 2020 · 17 comments

@O1lpunch3r

O1lpunch3r commented Dec 14, 2020

Hello,

I use the faker plugin in the Flink SQL client on my standalone cluster.

I created the following table:

CREATE TEMPORARY TABLE server_logs (
    client_ip STRING,
    client_identity STRING,
    userid STRING,
    user_agent STRING,
    log_time TIMESTAMP(3),
    request_line STRING,
    status_code STRING,
    size INT,
    WATERMARK FOR log_time AS log_time - INTERVAL '15' SECONDS
) WITH (
  'connector' = 'faker',
  'fields.client_ip.expression' = '#{Internet.publicIpV4Address}',
  'fields.client_identity.expression' =  '-',
  'fields.userid.expression' =  '-',
  'fields.user_agent.expression' = '#{Internet.userAgentAny}',
  'fields.log_time.expression' =  '#{date.past ''15'',''5'',''SECONDS''}',
  'fields.request_line.expression' = '#{regexify ''(GET|POST|PUT|PATCH){1}''} #{regexify ''(/search\.html|/login\.html|/prod\.html|/cart\.html|/order\.html){1}''} #{regexify ''(HTTP/1\.1|HTTP/2|HTTP/1\.0){1}''}',
  'fields.status_code.expression' = '#{regexify ''(200|201|204|400|401|403|301){1}''}',
  'fields.size.expression' = '#{number.numberBetween ''100'',''10000000''}'
);

Creating the table was successful, but then I tried to select from it:

SELECT * FROM server_logs

The following error occurred:

Flink SQL> select * from server_logs;
[ERROR] Could not execute SQL statement. Reason:
java.time.format.DateTimeParseException: Text 'Mon Dec 14 17:29:11 CET 2020' could not be parsed at index 2

The date expression is the one recommended in your README, so what is causing the failure?

@knaufk
Owner

knaufk commented Dec 14, 2020

First of all, thanks for reaching out. That's odd. I just ran another test with Flink 1.12, and it works.

  • Which Flink Version?
  • Are you sure you pulled the latest commit of this repository? I added date-time parsing quite late.
  • Does it happen immediately on the first row?

@O1lpunch3r
Author

  • Flink version 1.11.2
  • Downloaded the jar from the releases page (0.1.0) into the /lib directory of the SQL client
  • It happens immediately

Thanks for your quick response 👍

@O1lpunch3r
Author

O1lpunch3r commented Dec 14, 2020

Here is the stack trace:

org.apache.flink.runtime.JobException: Recovery is suppressed by NoRestartBackoffTimeStrategy
	at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:116)
	at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:78)
	at org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:192)
	at org.apache.flink.runtime.scheduler.DefaultScheduler.maybeHandleTaskFailure(DefaultScheduler.java:185)
	at org.apache.flink.runtime.scheduler.DefaultScheduler.updateTaskExecutionStateInternal(DefaultScheduler.java:179)
	at org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:503)
	at org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:386)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:284)
	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:199)
	at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74)
	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
	at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
	at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
	at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
	at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
	at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
	at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
	at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
	at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
	at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
	at akka.actor.ActorCell.invoke(ActorCell.scala:561)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
	at akka.dispatch.Mailbox.run(Mailbox.scala:225)
	at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
	at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
	at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
	at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
	at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.time.format.DateTimeParseException: Text 'Mon Dec 14 17:29:11 CET 2020' could not be parsed at index 2
	at java.time.format.DateTimeFormatter.parseResolved0(DateTimeFormatter.java:1949)
	at java.time.format.DateTimeFormatter.parse(DateTimeFormatter.java:1777)
	at com.github.knaufk.flink.faker.FakerUtils.stringValueToType(FakerUtils.java:47)
	at com.github.knaufk.flink.faker.FlinkFakerSourceFunction.generateNextRow(FlinkFakerSourceFunction.java:93)
	at com.github.knaufk.flink.faker.FlinkFakerSourceFunction.run(FlinkFakerSourceFunction.java:49)
	at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:100)
	at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:63)
	at org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:213)

On my CentOS standalone cluster, the error appears.

But when I run it in a Docker environment, like the one from Ververica, it works. That's odd.

@knaufk
Owner

knaufk commented Dec 15, 2020

It somehow seems to be related to your environment. I only found this SO question [1] that seems related, although the root cause differs. Maybe this helps in debugging. If there is anything I can do to help you debug this, let me know.

[1] https://stackoverflow.com/questions/56976974/datetimeparseexception-fails-on-one-host-works-on-another-same-jdk

@O1lpunch3r
Author

Thanks for the research. This is a really weird problem. Unfortunately, neither installing AdoptOpenJDK HotSpot (versions 8 and 11) nor downgrading the tzdata/tzdata-java RPM packages made any difference. Maybe you can reproduce it in a CentOS environment.

@appleyuchi

appleyuchi commented Jan 1, 2021

Utter nonsense.

@appleyuchi

I reproduced it in Flink 1.12 with the following code:

CREATE TEMPORARY TABLE timestamp_example (
  `timestamp1` TIMESTAMP(3),
  `timestamp2` TIMESTAMP(3)
)
WITH (
  'connector' = 'faker', 
  'fields.timestamp1.expression' = '#{date.past ''15'',''SECONDS''}',
  'fields.timestamp2.expression' = '#{date.past ''15'',''5'',''SECONDS''}'
);

SELECT * FROM timestamp_example;

@appleyuchi

Maybe it is due to the time zone?

@O1lpunch3r
Author

Which operating system and Java runtime (JRE) do you use for the SQL client and the Flink cluster, @appleyuchi?

@appleyuchi

appleyuchi commented Jan 2, 2021

Which operating system and Java runtime (JRE) do you use for the SQL client and the Flink cluster, @appleyuchi?

Environment   Version
System        Ubuntu 20.04
Flink (HA)    1.12
JDK           1.8.0_131
ZooKeeper     3.6.0

OpenJDK is uninstalled (I'm sure about this).

My $FLINK_HOME/lib is as follows:
commons-cli-1.4.jar
flink-connector-hbase-2.2_2.12-1.12.0_jar
flink-connector-hive_2.12-1.12.0.jar
flink-connector-jdbc_2.12-1.12.0.jar
flink-connector-kafka_2.12-1.12.0.jar
flink-connector-redis_2.11-1.1.5.jar
flink-csv-1.12.0.jar
flink-dist_2.12-1.12.0.jar
flink-faker-0.1.0.jar
flink-json-1.12.0.jar
flink-shaded-hadoop-3-uber-3.1.1.7.0.3.0-79-7.0.jar
flink-shaded-zookeeper-3.4.14.jar
flink-sql-connector-hbase-2.2_2.12-1.12.0.jar
flink-sql-connector-hive-3.1.2_2.12-1.12.0.jar
flink-table_2.12-1.12.0.jar
flink-table-blink_2.12-1.12.0.jar
flink-table-planner_2.12-1.12.0.jar
flink-table-planner-blink_2.12-1.12.0.jar
hadoop-yarn-api-3.1.2.jar
hive-common-3.1.2.jar
hive-exec-3.1.2.jar
javax.ws.rs-api-2.0.jar
kafka-clients-2.5.0.jar
libjars
log4j-1.2-api-2.12.1.jar
log4j-api-2.12.1.jar
log4j-core-2.12.1.jar
log4j-slf4j-impl-2.12.1.jar
mysql-connector-java-8.0.22.jar

I have added the following configuration to $FLINK_HOME/conf/flink-conf.yaml:
env.java.opts.jobmanager: -Duser.timezone=Etc/GMT
env.java.opts.taskmanager: -Duser.timezone=Etc/GMT

The faker connector was downloaded from https://github.com/knaufk/flink-faker,
built with mvn clean package, and the resulting jar was placed under $FLINK_HOME/lib.

@appleyuchi

If you have time, you can control my computer via TeamViewer to reproduce it.

@knaufk
Owner

knaufk commented Jan 3, 2021

Hi @appleyuchi, @O1lpunch3r,

thanks for your continued interest in this issue. I still could not reproduce it (with Flink 1.11 or 1.12, on OpenJDK 8 or 11). Anyway, my guess is that https://github.com/knaufk/flink-faker/tree/datetimeparsing might fix it. Could you check out this branch, run mvn clean package, and retry your queries?

Thanks,

Konstantin

@appleyuchi

I'll try it right now.

@appleyuchi

appleyuchi commented Jan 4, 2021

Perfect, everything works now.
This was tested in China with the branch
https://github.com/knaufk/flink-faker/tree/datetimeparsing

However, before you close this issue, could you tell us whether you will merge it into https://github.com/knaufk/flink-faker?
Thanks for your help.

@knaufk knaufk added the bug Something isn't working label Jan 4, 2021
@knaufk
Owner

knaufk commented Jan 4, 2021

Great! Thank you for testing.

If this also fixes it for @O1lpunch3r, I would merge to master and include it in the next release.

@O1lpunch3r
Author

Great! It is working now. I have tested @appleyuchi's example code and mine. Both work with your fix. Thank you @knaufk for fixing this. This connector is great for testing.

@knaufk knaufk self-assigned this Jan 4, 2021
@knaufk knaufk closed this as completed in 378319d Jan 4, 2021
knaufk added a commit that referenced this issue Jan 4, 2021
Set DateTimeFormatter locale to us. Resolves #1.
@knaufk
Owner

knaufk commented Jan 4, 2021

Thank you both for your help. I quickly released 0.1.1 with this fix included.
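
For readers who hit the same exception: the commit message above ("Set DateTimeFormatter locale to us") suggests the failure depended on the JVM's default locale, because a formatter built without an explicit locale expects weekday and month names in that locale. The following is a minimal sketch of that effect, not the connector's actual code. The class name, the helper method, and the pattern string are assumptions, chosen only to match the text shown in the error message.

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

public class LocaleParseDemo {
    // Assumed pattern: mirrors the layout of the text in the error message
    // (the java.util.Date#toString() style); not copied from flink-faker.
    static final String PATTERN = "EEE MMM dd HH:mm:ss zzz yyyy";

    // Hypothetical helper: reports whether the text parses when the
    // formatter is pinned to the given locale.
    static boolean parses(String text, Locale locale) {
        try {
            ZonedDateTime.parse(text, DateTimeFormatter.ofPattern(PATTERN, locale));
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String text = "Mon Dec 14 17:29:11 CET 2020";
        // English weekday and month names match an English-locale formatter.
        System.out.println("Locale.US:      " + parses(text, Locale.US));
        // A German-locale formatter expects German short names instead,
        // so the English "Mon" is rejected.
        System.out.println("Locale.GERMANY: " + parses(text, Locale.GERMANY));
    }
}
```

Pinning the locale on the formatter, as in the `Locale.US` call, makes parsing independent of the host's language settings, which would explain why the same jar worked in one environment and failed in another.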
