my commit by easyfan · Pull Request #4819 · apache/pulsar

easyfan · 2019-07-26T02:10:35Z

<!--

Contribution Checklist

Name the pull request in the form "[Issue XYZ][component] Title of the pull request", where XYZ should be replaced by the actual issue number.
Skip Issue XYZ if there is no associated github issue for this pull request.
Skip component if you are unsure about which is the best component. E.g. [docs] Fix typo in produce method.
Fill out the template below to describe the changes contributed by the pull request. That will give reviewers the context they need to do the review.
Each pull request should address only one issue, not mix up code from multiple issues.
Each commit in the pull request has a meaningful commit message
Once all items of the checklist are addressed, remove the above text and this checklist, leaving only the filled out template below.

(The sections below can be removed for hotfixes of typos)
-->

(If this PR fixes a github issue, please add Fixes #<xyz>.)

Fixes #

(or if this PR is one task of a github issue, please add Master Issue: #<xyz> to link to the master issue.)

Master Issue: #

Motivation

Explain here the context, and why you're making that change. What is the problem you're trying to solve.

Modifications

Describe the modifications you've done.

Verifying this change

Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

Added integration tests for end-to-end deployment with large payloads (10MB)
Extended integration test for recovery after broker failure

Does this pull request potentially affect one of the following parts:

If yes was chosen, please highlight the changes

Dependencies (does it add or upgrade a dependency): (yes / no)
The public API: (yes / no)
The schema: (yes / no / don't know)
The default values of configurations: (yes / no)
The wire protocol: (yes / no)
The rest endpoints: (yes / no)
The admin cli options: (yes / no)
Anything that affects deployment: (yes / no / don't know)

Documentation

Does this pull request introduce a new feature? (yes / no)
If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)
If a feature is not applicable for documentation, explain why?
If a feature is not documented yet in this PR, please create a followup issue for adding the documentation

…stats (apache#4615)

### Motivation

Broker throws NPE when pulsar-admin tries to fetch stats-internal for topic with reader.

```
Caused by: java.lang.NullPointerException
    at org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.getProperties(ManagedCursorImpl.java:234) ~[classes/:?]
    at org.apache.pulsar.broker.service.persistent.PersistentTopic.lambda$getInternalStats$48(PersistentTopic.java:1461) ~[classes/:?]
    at java.lang.Iterable.forEach(Iterable.java:75) ~[?:1.8.0_92]
    at org.apache.pulsar.broker.service.persistent.PersistentTopic.getInternalStats(PersistentTopic.java:1446) ~[classes/:?]
    at org.apache.pulsar.broker.admin.impl.PersistentTopicsBase.internalGetInternalStats(PersistentTopicsBase.java:621) ~[classes/:?]
    at org.apache.pulsar.broker.admin.v2.PersistentTopics.getInternalStats(PersistentTopics.java:430) ~[classes/:?]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_92]
```

### Motivation

Currently, the partitioned-stats API response includes stats for each partition. However, if the number of partitions and clients is large, the size of the response will be very large. In such cases, it is useful to have a query parameter to get a response that does not include stats for each partition.

```sh
$ curl -s http://localhost:8080/admin/persistent/sample/standalone/ns1/pt1/partitioned-stats | jq .
{
  "msgRateIn": 0,
  "msgThroughputIn": 0,
  "msgRateOut": 0,
  "msgThroughputOut": 0,
  "averageMsgSize": 0,
  "storageSize": 0,
  "publishers": [],
  "subscriptions": {
    "sub1": {
      "msgRateOut": 0,
      "msgThroughputOut": 0,
      "msgRateRedeliver": 0,
      "msgBacklog": 0,
      "blockedSubscriptionOnUnackedMsgs": false,
      "msgDelayed": 0,
      "unackedMessages": 0,
      "msgRateExpired": 0,
      "consumers": [],
      "isReplicated": false
    }
  },
  "replication": {},
  "metadata": {
    "partitions": 2
  },
  "partitions": {
    "persistent://sample/standalone/ns1/pt1-partition-1": {
      "msgRateIn": 0,
      "msgThroughputIn": 0,
      "msgRateOut": 0,
      "msgThroughputOut": 0,
      "averageMsgSize": 0,
      "storageSize": 0,
      "publishers": [],
      "subscriptions": {
        "sub1": {
          "msgRateOut": 0,
          "msgThroughputOut": 0,
          "msgRateRedeliver": 0,
          "msgBacklog": 0,
          "blockedSubscriptionOnUnackedMsgs": false,
          "msgDelayed": 0,
          "unackedMessages": 0,
          "msgRateExpired": 0,
          "consumers": [],
          "isReplicated": false
        }
      },
      "replication": {},
      "deduplicationStatus": "Disabled"
    },
    "persistent://sample/standalone/ns1/pt1-partition-0": {
      "msgRateIn": 0,
      "msgThroughputIn": 0,
      "msgRateOut": 0,
      "msgThroughputOut": 0,
      "averageMsgSize": 0,
      "storageSize": 0,
      "publishers": [],
      "subscriptions": {
        "sub1": {
          "msgRateOut": 0,
          "msgThroughputOut": 0,
          "msgRateRedeliver": 0,
          "msgBacklog": 0,
          "blockedSubscriptionOnUnackedMsgs": false,
          "msgDelayed": 0,
          "unackedMessages": 0,
          "msgRateExpired": 0,
          "consumers": [],
          "isReplicated": false
        }
      },
      "replication": {},
      "deduplicationStatus": "Disabled"
    }
  }
}
```

### Modifications

Added query parameter named `perPartition` to the partitioned-stats API. The default value is true.
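As a rough illustration of how a client might opt out of the per-partition section, the sketch below builds the request URL with `perPartition=false`. The helper name `partitioned_stats_url` is hypothetical, and passing the value as the lowercase strings `true`/`false` is an assumption; only the parameter name, its default, and the endpoint path come from the description above.

```python
from urllib.parse import urlencode


def partitioned_stats_url(base_url: str, topic_path: str, per_partition: bool = True) -> str:
    """Build a partitioned-stats URL (hypothetical helper, not part of Pulsar).

    `per_partition` mirrors the `perPartition` query parameter, which the
    description says defaults to true.
    """
    # Assumption: the endpoint accepts lowercase "true"/"false" strings.
    query = urlencode({"perPartition": str(per_partition).lower()})
    return f"{base_url}/admin/persistent/{topic_path}/partitioned-stats?{query}"


url = partitioned_stats_url(
    "http://localhost:8080", "sample/standalone/ns1/pt1", per_partition=False
)
print(url)
```

The resulting URL can be passed to `curl -s` in the same way as the command shown in the Motivation section.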

…pache#4358)

### Motivation

Currently the JDBC Sink does not support delete and update events. This change adds support for them.

### Modifications

Support delete and update events. Add some documentation for the JDBC Sink. apache#4073

### Verifying this change

Local unit tests pass. Integration tests pass.

feca5bb changed the topic delete logic to delete the schema when the topic is deleted (though this only seems to be enabled for idle topic GC). This exposed a bug in compatibility checking: if a subscription tries to attach to the topic, a compatibility exception is thrown even when it uses the same schema as before. This is because the topic still appears to have a schema, even though there is no actual schema data, just a tombstone.

I've changed the logic to return no schema if the schema read back is a tombstone. The issue doesn't affect producers because the check was already correct there.

I've also updated the check for transitive compatibility to remove the prefix of schemas before the deleted schema. Previously this was throwing an NPE on the broker as it couldn't decode the deleted schema.

This issue was discovered by failures in the healthcheck. The check period (5 minutes) was longer than the GC period (60 seconds). I would expect it to hit quite often in other scenarios also.
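The two rules described above (a tombstone reads back as "no schema", and the transitive check ignores everything before the deleted schema) can be sketched as follows. This is a minimal illustration under assumed names: `StoredSchema`, `latest_schema`, and `schemas_for_transitive_check` are hypothetical and do not correspond to the broker's actual classes, and dropping the tombstone entry itself along with the prefix is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StoredSchema:
    """Hypothetical stand-in for one stored schema version."""
    version: int
    data: bytes
    deleted: bool = False  # True marks a tombstone left by topic deletion


def latest_schema(history: List[StoredSchema]) -> Optional[StoredSchema]:
    """Return the latest schema, or None if there is none or it is a tombstone."""
    if not history:
        return None
    latest = history[-1]
    return None if latest.deleted else latest


def schemas_for_transitive_check(history: List[StoredSchema]) -> List[StoredSchema]:
    """Keep only the versions that come after the most recent tombstone."""
    last_tombstone = -1
    for index, schema in enumerate(history):
        if schema.deleted:
            last_tombstone = index
    return history[last_tombstone + 1:]


history = [
    StoredSchema(0, b"v0"),
    StoredSchema(1, b"", deleted=True),
    StoredSchema(2, b"v2"),
]
print([s.version for s in schemas_for_transitive_check(history)])
```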

### Motivation

Currently a Go function must be compiled before it is deployed to Pulsar, so the function package needs the executable permission once it has been downloaded from BookKeeper to the local node. This PR makes the Go function package executable after it is downloaded from BookKeeper, so that the function can run.
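The idea of marking a downloaded package as executable can be sketched like this. It is only an illustration of the permission change, not the code from this PR; the helper name `make_executable` and the choice to set the execute bit for user, group, and others are assumptions.

```python
import os
import stat


def make_executable(path: str) -> None:
    """Add execute permission to an already-downloaded file (hypothetical helper)."""
    current_mode = os.stat(path).st_mode
    # Assumption: user, group and others all get the execute bit.
    os.chmod(path, current_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```

A caller would invoke `make_executable(local_package_path)` right after the download completes and before starting the function process.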

* [docs] add security warning on standalone doc

Add a warning to the standalone getting-started document. With the default configuration, Pulsar can be accessed from a remote server without any authentication or encryption, so a clear warning to the user is critical to avoid unexpected security risks.

…ache#4717)

* add memory requirement and config tips for standalone mode

The current standalone getting-started doc neither mentions nor links to Pulsar's memory usage. Users with limited free memory may have trouble starting Pulsar. This adds tips on how to change the default required heap memory.

### Motivation

Switched back to using the regular `java.util.concurrent.CopyOnWriteArrayList` instead of the class extending it, since we don't really gain anything from accessing the underlying array of objects. The reflection used to get that field gives errors on Java 12.

### Motivation

Currently, if the Kubernetes namespace configured for deploying functions is different from the one in which the brokers/workers reside, get status and get stats do not work because the URL for instances does not specify the namespace.

…he#4709)

*Motivation*

When using PulsarService or BrokerService for testing, tests might need to access the components in PulsarService and BrokerService. This change adds setters and getters to access the components in PulsarService & BrokerService.

### Motivation

After the changes in apache#3118, there has been a sharp increase in memory utilization for the UnackedMessageTracker due to the time buckets being created. This is especially true when the acktimeout is set to a larger value (e.g. 1h), where 3600 time buckets are created. This leads to using 20MB per partition even when no message is tracked.

This change allows configuring the tick time so that applications can tune it based on their needs.

Additionally, it fixes the logic that keeps creating hash maps and throwing them away at each tick time iteration, since that creates a lot of garbage and doesn't take advantage of the fact that the hash maps expand based on the required capacity (so next time they are already of the "right" size).

On a final note: the current default of 1 sec seems very wasteful. Something like 10s should be more appropriate as the default.
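To make the arithmetic concrete, the sketch below computes the number of time buckets from the ack timeout and the tick time, and shows a tracker that rotates and reuses its bucket objects instead of allocating a new one per tick. It is a simplified model under assumptions, not the client's implementation: the names `bucket_count` and `TimeBucketTracker` are hypothetical, rounding up is assumed, and Python sets stand in for the hash maps mentioned above.

```python
import math
from collections import deque
from typing import Deque, List, Set


def bucket_count(ack_timeout_ms: int, tick_ms: int) -> int:
    """Number of time buckets needed to cover the ack timeout (assumed: rounded up)."""
    return math.ceil(ack_timeout_ms / tick_ms)


class TimeBucketTracker:
    """Hypothetical tracker: one bucket per tick, oldest bucket expires first."""

    def __init__(self, ack_timeout_ms: int, tick_ms: int) -> None:
        count = bucket_count(ack_timeout_ms, tick_ms)
        self.buckets: Deque[Set[str]] = deque(set() for _ in range(count))

    def add(self, message_id: str) -> None:
        # New messages always go into the newest bucket.
        self.buckets[-1].add(message_id)

    def remove(self, message_id: str) -> None:
        # Called on ack: the message may sit in any bucket.
        for bucket in self.buckets:
            bucket.discard(message_id)

    def tick(self) -> List[str]:
        """Expire the oldest bucket and reuse the same set object as the newest one."""
        oldest = self.buckets.popleft()
        expired = list(oldest)
        oldest.clear()
        self.buckets.append(oldest)
        return expired


print(bucket_count(3_600_000, 1_000))   # 1h timeout, 1s tick
print(bucket_count(3_600_000, 10_000))  # 1h timeout, 10s tick
```

With a one-hour timeout, moving the tick from 1s to 10s cuts the bucket count by a factor of ten, at the cost of coarser timeout granularity.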

…e#4746)

### Motivation

The `pulsar-function-go/conf` package sets `instance-conf-path` to the default value `HOME_PATH+github.com/apache/pulsar/pulsar-function-go/conf/conf.yaml`. Once the function is deployed, the node it runs on may not have that yaml conf file, and the Go function then panics with a `not found conf file` error. This PR changes the config parsing logic to parse `confContent` first, and to parse `confFilePath` only if `confContent` is empty.
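The lookup order described above (inline content first, file path only as a fallback) can be sketched as below. This is an illustration of the ordering only: the function name `resolve_conf_text` is hypothetical, the error message reuses the wording quoted in the description, and the sketch returns the raw text rather than parsing YAML.

```python
from pathlib import Path


def resolve_conf_text(conf_content: str, conf_file_path: str) -> str:
    """Pick the configuration source: inline content wins, the file is a fallback."""
    if conf_content:
        # Inline content is present, so the file is never touched.
        return conf_content
    path = Path(conf_file_path)
    if not path.is_file():
        raise FileNotFoundError(f"not found conf file: {conf_file_path}")
    return path.read_text()


# Inline content is used even though the path does not exist.
print(resolve_conf_text("pulsarServiceURL: example", "/nonexistent/conf.yaml"))
```

Because the file is only consulted when the inline content is empty, a node without the yaml file no longer fails as long as the content is supplied.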

…ns (apache#4539)

* [Pulsar-Client] Add Producer Numeric Properties Validation
* Aligned deprecated and new Producer API validations
* Deprecated and new Producer API validations are being aligned
* batchingMaxMessages C++ API is being aligned with Java API
* batchingMaxMessages Java API Validation is being aligned with C++ API
* Review comments are addressed
* Fix broken UTs

### Motivation

Fix apache#4732

### Modifications

Add options to rewrite the namespace delimiter, disabled by default.

Enabling the namespace delimiter rewrite works well with superset:

<img width="1279" alt="superset" src="https://user-images.githubusercontent.com/12592133/61385412-f0f35700-a8e4-11e9-87b2-a31b62128b58.png">

### Does this pull request potentially affect one of the following parts:

*If `yes` was chosen, please highlight the changes*

- Dependencies (does it add or upgrade a dependency): (no)
- The public API: (no)
- The schema: (no)
- The default values of configurations: (no)
- The wire protocol: (no)
- The rest endpoints: (no)
- The admin cli options: (no)
- Anything that affects deployment: (no)

### Documentation

- Does this pull request introduce a new feature? (no)

Master Issue: apache#4756

### Motivation

This is a continuation of apache#4765.

### Modifications

Added async rest handlers to the following APIs:

```
DELETE /admin/namespaces/{tenant}/{cluster}/{namespace}
PUT /admin/namespaces/{tenant}/{cluster}/{namespace}/unload
POST /admin/namespaces/{tenant}/{cluster}/{namespace}/clearBacklog
POST /admin/namespaces/{tenant}/{cluster}/{namespace}/clearBacklog/{subscription}
POST /admin/namespaces/{tenant}/{cluster}/{namespace}/unsubscribe/{subscription}
DELETE /admin/v2/namespaces/{tenant}/{namespace}
PUT /admin/v2/namespaces/{tenant}/{namespace}/unload
POST /admin/v2/namespaces/{tenant}/{namespace}/clearBacklog
POST /admin/v2/namespaces/{tenant}/{namespace}/clearBacklog/{subscription}
POST /admin/v2/namespaces/{tenant}/{namespace}/unsubscribe/{subscription}
```

…transaction (apache#4776)

* [Transaction][Buffer]Add new marker to show which message belongs to transaction

---

*Motivation*

Add new message types to the transaction log: data messages plus commit and abort markers.

*Modifications*

Add two new types of transaction messages. TXN_COMMIT is the commit marker of the transaction. TXN_ABORT is the abort marker of the transaction.

* Simplified assert statements in the tests. Switch to usage of static imports in tests. (Part 1)
* Simplify assert statements in the tests and use the appropriate assert statements. Switch to usage of static imports in tests. Remove unused imports (Part 2)

ZhengFan and others added 30 commits July 2, 2019 12:28

C++ client producer sendAsync() method will be blocked forever, if en…

234aca1

…ough batched messages sent timeout. (apache#4569)

Issue apache#4638 : Update Kafka connect-api to version 2.3.0 (apache…

e1fb74a

…#4650) * Issue apache#4638: Update Kafka connect-api to version 2.3.0 * remove 'block.on.buffer.full' property (already removed from kafka)

Improve and add authorization to function download and upload (apach…

42bd8e5

…e#4644) * Improve and add authorization to function download and upload * cleaning up * fix bug

Fixed managed ledger admin tool to work with Python3 (apache#4624)

b5d64f7

Allows consumer retrieve the sequence id that the producer set. (apac…

fa77510

…he#4645) * Allows consumer retrieve the sequence id that the producer set. * fix comments.

Fixed C++ client lookup over HTTP on standalone (apache#4625)

1f3b126

File system offload (apache#4403)

3b0a7b5

Fixes apache#3216 Implementation of offload to HDFS ### Motivation Implementation of offload to HDFS ### Verifying this change Add the test for this

A modified version (apache#4667)

ebfdf33

[pulsar-broker] avoid retrying deleting namespace when topic is alrea…

46f328a

…dy deleted/fenced (apache#4665)

fix issue when submitting NAR via file url (apache#4577)

3a299b1

* fix issue when submitting NAR via file url * fix unit tests * add more specific errors * fix test

[doc] Add code samples for Protobuf and Avro schemas in java document…

f3a7e08

…ation (apache#4670) * Add code samples for Protobuf and Avro schemas in java * Update the code snippets for JSON, Protobuf, and Avro schemas with the preferred form.

Doc release 2.4.0 (apache#4666)

cdd290e

### Motivation Release 2.4.0 doc

[doc] [Issue apache#4661] fix broken links on 'Develop connectors' pa…

aed42fd

…ge (apache#4674)

2.4.0 release notes

39da135

Added release notes for 2.4.0 release

Renamed C++ logger enum names to avoid conflicts with compiler macros (…

36e2cdd

…apache#4664)

### Motivation

Fixes apache#4655

Some compilers define a macro named `DEBUG`, which clashes with the enum value name. Adding a prefix avoids the macro replacement.

Add pulsar-io-influxdb to distribution. (apache#4678)

5d79087

Support Pulsar schema for pulsar kafka client wrapper (apache#4534)

37349ac

Fixes apache#4228 Master Issue: apache#4228 ### Motivation Use Pulsar schema in pulsar kafka client. ### Modifications Support schema of pulsar for pulsar kafka client ### Verifying this change Add Unit test

Update Debezium version to 0.9.5.Final (apache#4673)

265ee33

Update Debezium version to 0.9.5.Final

Upgrade docusaurus to 1.11.1 (apache#4682)

aaa2787

Fixes apache#4606 Master Issue: apache#4606 ### Motivation Refine the framework and implement language-specific code tabs ### Modifications Upgrade the docusaurus version to 1.11.1 for support code tab

Cleanup in the pulsar-log4j-appender project (apache#4681)

4fd11c2

Cleanup tests in the presto module (apache#4683)

121366a

* Add static import statements for Assert to simplify the test in the presto module * Use the preferred way of the schema's creation. The predicates and functions were converted to lambda

Fixed the default port for https and http in admin client (apache#4623)

90301ea

* Fixed the default port for https and http in admin client * Fixed test expectation * Removed space added by mistake

Blog of 2.4.0 release (apache#4677)

f0fad77

* 2.4.0 release blog. * fix comments. * fix comments. * fix comments.

[] [site2/docs] python api reader interface example correction to doc…

9e94920

…umentation (apache#4690) * update master with correction to documentation for python reader * Update site/docs/latest/clients/Python.md Co-Authored-By: Matteo Merli <mmerli@apache.org>

Add allowAutoTopicCreation to broker.conf and reference configuration. (

b0fa00d

apache#4694)

Added delayed messages in Prometheus when using namespace-level metri…

57af6ab

…cs aggregation (apache#4691)

freeznet and others added 29 commits July 26, 2019 09:47

fix: add anonymous role to proxy configuration (apache#4733)

b92eb4f

Increasing Dashboard consumerName field to 256 varchar (apache#4716)

7dc4669

* Changed remove-backlog-quotas to remove-backlog-quota * Changed remove-backlog-quotas to remove-backlog-quota * Increased the consumerName field to varchar(256) Signed-off-by: Yuvaraj Loganathan <uvaraj6@gmail.com>

Fix document of debezium (apache#4713)

f95edf0

### Motivation

There are some typos in the Debezium document that affect users, so fix them.

### Modifications

Fix typos in the Debezium document and format the content.

Convert anonymous classes to lambda (apache#4703)

1a68e43

* Convert anonymous functions to lambda * Replacing lambda with anonymous implementation, because lambda cannot be mocked

Update README documentation (apache#4773)

fe08e9a

Add Upgrade Guide to Apache Pulsar (apache#4770)

94fe8f6

* Add Upgrade Guide to Apache Pulsar *Changes* Add a general upgrade guide to apache pulsar. * Update the upgrade guide

Added more blog posts in the resources page (apache#4774)

d49645d

* Added more blog posts in the resources page * Added 2 more posts

Fix：PulsarKafkaProducer is not thread safe (apache#4745)

a9387a5

fix apache#4707

Pulsar SQL supports pulsar's primitive schema (apache#4728)

cc3adb1

### Motivation Continue the PR of apache#4151

[Doc] Add Schema Chapter and Get Started Section (apache#4759)

368bfca

Add an independent Chapter for Pulsar Schema. This is the first section—Get started.

Add a few recent presentations to the resources page (apache#4783)

adfe7a0

*Motivation* Add a few recent presentations to the resources page. They cover different topics: - 2.4.0 release - use case - serverless - spark + pulsar - flink + pulsar

add basic authentication capabilities to Pulsar SQL (apache#4779)

3b71c73

[Issue apache#4756][broker] Process requests asynchronously on some R…

4f514b2

…EST APIs (3) (apache#4795) * Process requests asynchronously on some REST APIs (3) * Add async rest handler to API for expiring message on single topic subscription

upgrade git_commit_id_plugin to 3.0 (apache#4801)

536e158

Reuse ManagedLedgerFactory instances across SQL queries (apache#4813)

242bdba

Definitely this is a typo. This method is dealing the Failed Message …

58bc40c

…with the GIVEN result, but not a CERTAIN result.

easyfan closed this Jul 26, 2019

easyfan wants to merge 79 commits into apache:master from easyfan:mr_master


20 participants
