Change type of publish_time to timestamp #4757
Merged
jiazhai
approved these changes
Jul 18, 2019
run Integration Tests
run java8 tests
sijie
approved these changes
Jul 18, 2019
run java8 tests
jerrypeng
approved these changes
Jul 18, 2019
@codelipenghui thanks for making this fix! I tested your branch out and everything is working correctly.
rerun java8 tests
easyfan
pushed a commit
to easyfan/pulsar
that referenced
this pull request
Jul 26, 2019
Fixes apache#4734

### Motivation

"publish_time" is a Pulsar SQL internal column. Pulsar stores only timestamps, not timezone information, so timestamp is the more correct type for "publish_time" in Pulsar SQL.

### Modifications

Change type of publish_time to timestamp.

### Verifying this change

The predicate on publish_time is pushed down.

Use `__publish_time__` to trim messages:

```
SELECT COUNT(*) FROM "sql-test-1" WHERE "__publish_time__" >= TIMESTAMP '2019-07-18 17:26:50.119' AND "__publish_time__" < TIMESTAMP '2019-07-18 17:26:51.119';
```

![image](https://user-images.githubusercontent.com/12592133/61447301-43835080-a983-11e9-814b-bc2b378f02b9.png)

Without `__publish_time__` predicate:

```
SELECT COUNT(*) FROM "sql-test-1";
```

![image](https://user-images.githubusercontent.com/12592133/61447427-82190b00-a983-11e9-8d3f-3bf2a4798047.png)
jiazhai
pushed a commit
that referenced
this pull request
Aug 28, 2019
Fixes #4734: Change type of publish_time to timestamp. (cherry picked from commit 6f5416e)
sijie
pushed a commit
that referenced
this pull request
Feb 7, 2020
…d due to TTL (#6211)

Fixes #5579

### Motivation

In Pulsar 2.4.1 and later versions, if message TTL is enabled, `PersistentMessageExpiryMonitor` always deletes one non-expired message every 5 minutes.

The cause of this bug is #4744. `PersistentMessageExpiryMonitor` expects `ManagedCursor#asyncFindNewestMatching()` to pass null as the found position to its callback if no expired messages exist.

https://github.com/apache/pulsar/blob/c5ba52983fee994de61984aae7d1757e9b738caf/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentMessageExpiryMonitor.java#L119-L130

However, since the change in #4744, if no entry matches the search condition, the callback is passed `startPosition` instead of null. As a result, `PersistentMessageExpiryMonitor` always deletes the earliest backlog message, so unexpected message loss can occur.

### Modifications

Revert the #4744 changes. The motivation of #4744 was to avoid an NPE in pulsar-sql, but that seems to be fixed in #4757.

https://github.com/apache/pulsar/blob/2069f761753940ed6a1faca8999af70036f20fd6/pulsar-sql/presto-pulsar/src/main/java/org/apache/pulsar/sql/presto/PulsarSplitManager.java#L363-L382
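The callback contract described in this commit message can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: `ExpirySketch`, `findNewestMatching`, `findNewestMatchingOrStart`, and `expireUpTo` are hypothetical names, list indexes stand in for ledger positions, and none of this is the actual `ManagedCursor` or `PersistentMessageExpiryMonitor` code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongPredicate;

// Hypothetical model of the contract; not Pulsar's real classes.
class ExpirySketch {

    // Returns the index of the newest entry that satisfies the condition,
    // or null when no entry matches (the contract the expiry monitor expects).
    public static Integer findNewestMatching(List<Long> publishTimes, LongPredicate expired) {
        Integer found = null;
        for (int i = 0; i < publishTimes.size(); i++) {
            if (!expired.test(publishTimes.get(i))) {
                break; // entries are ordered oldest first, so stop at the first non-match
            }
            found = i;
        }
        return found;
    }

    // Models the changed behaviour: falls back to the start position (index 0)
    // instead of null when nothing matches.
    public static Integer findNewestMatchingOrStart(List<Long> publishTimes, LongPredicate expired) {
        Integer found = findNewestMatching(publishTimes, expired);
        return found != null ? found : 0;
    }

    // Drops every entry up to and including the found position.
    // A null position means there is nothing to expire.
    public static List<Long> expireUpTo(List<Long> backlog, Integer found) {
        if (found == null) {
            return new ArrayList<>(backlog);
        }
        return new ArrayList<>(backlog.subList(found + 1, backlog.size()));
    }

    public static void main(String[] args) {
        List<Long> backlog = List.of(100L, 200L, 300L);
        LongPredicate expired = t -> t < 50L; // no message is old enough to expire

        // Null contract: the backlog is left untouched.
        System.out.println(expireUpTo(backlog, findNewestMatching(backlog, expired)).size());
        // Start-position fallback: the oldest, non-expired message is removed.
        System.out.println(expireUpTo(backlog, findNewestMatchingOrStart(backlog, expired)).size());
    }
}
```

In this model, the caller cannot distinguish "the first entry matched" from "nothing matched" once the fallback returns the start position, which is the ambiguity the revert removes.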
aahmed-se
pushed a commit
to aahmed-se/pulsar
that referenced
this pull request
Feb 11, 2020
…d due to TTL (apache#6211)
tuteng
pushed a commit
to AmateurEvents/pulsar
that referenced
this pull request
Feb 23, 2020
…d due to TTL (apache#6211)
tuteng
pushed a commit
to AmateurEvents/pulsar
that referenced
this pull request
Mar 21, 2020
…d due to TTL (apache#6211) (cherry picked from commit 54b39e6)
tuteng
pushed a commit
that referenced
this pull request
Apr 13, 2020
…d due to TTL (#6211) (cherry picked from commit 54b39e6)
jiazhai
pushed a commit
to jiazhai/pulsar
that referenced
this pull request
May 18, 2020
…d due to TTL (apache#6211) (cherry picked from commit 54b39e6)
huangdx0726
pushed a commit
to huangdx0726/pulsar
that referenced
this pull request
Aug 24, 2020
…d due to TTL (apache#6211)
Fixes #4734

Motivation

"publish_time" is a Pulsar SQL internal column. Pulsar stores only timestamps, not timezone information, so timestamp is the more correct type for "publish_time" in Pulsar SQL.

Modifications

Change type of publish_time to timestamp.

Verifying this change

The predicate on publish_time is pushed down: a query that uses `__publish_time__` to trim messages can be compared against the same query without a `__publish_time__` predicate.
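The verification query in the commit message filters `__publish_time__` on a half-open range (`>=` lower bound, `<` upper bound). As a rough illustration of what that range means in terms of stored publish times, here is a minimal sketch. It assumes the timestamp literals are read as UTC (the real session time zone may differ), and `PublishTimeRange`, `toEpochMillis`, and `trim` are hypothetical names, not part of Pulsar SQL.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper; it does not reflect the connector's actual pushdown code.
class PublishTimeRange {
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss.SSS");

    // Assumption: the literal carries no zone and is interpreted as UTC.
    public static long toEpochMillis(String timestampLiteral) {
        return LocalDateTime.parse(timestampLiteral, FORMAT)
                .toInstant(ZoneOffset.UTC)
                .toEpochMilli();
    }

    // Keeps publish times in [lowerInclusive, upperExclusive), mirroring >= and <.
    public static List<Long> trim(List<Long> publishTimes, long lowerInclusive, long upperExclusive) {
        return publishTimes.stream()
                .filter(t -> t >= lowerInclusive && t < upperExclusive)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        long lower = toEpochMillis("2019-07-18 17:26:50.119");
        long upper = toEpochMillis("2019-07-18 17:26:51.119");
        // Only the lower bound itself and the value inside the range survive.
        List<Long> kept = trim(List.of(lower - 1, lower, lower + 500, upper), lower, upper);
        System.out.println(kept.size()); // prints 2
    }
}
```

The two bounds in the query are exactly 1000 ms apart, so the range covers one second of publish times.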