my commit#4819

Closed
easyfan wants to merge 79 commits into apache:master from easyfan:mr_master

easyfan commented Jul 26, 2019

<!--

Contribution Checklist

  • Name the pull request in the form "[Issue XYZ][component] Title of the pull request", where XYZ should be replaced by the actual issue number.
    Skip Issue XYZ if there is no associated github issue for this pull request.
    Skip component if you are unsure about which is the best component. E.g. [docs] Fix typo in produce method.

  • Fill out the template below to describe the changes contributed by the pull request. That will give reviewers the context they need to do the review.

  • Each pull request should address only one issue, not mix up code from multiple issues.

  • Each commit in the pull request has a meaningful commit message

  • Once all items of the checklist are addressed, remove the above text and this checklist, leaving only the filled out template below.

(The sections below can be removed for hotfixes of typos)
-->

(If this PR fixes a github issue, please add Fixes #<xyz>.)

Fixes #

(or if this PR is one task of a github issue, please add Master Issue: #<xyz> to link to the master issue.)

Master Issue: #

Motivation

Explain here the context, and why you're making that change. What is the problem you're trying to solve.

Modifications

Describe the modifications you've done.

Verifying this change

  • Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment with large payloads (10MB)
  • Extended integration test for recovery after broker failure

Does this pull request potentially affect one of the following parts:

If yes was chosen, please highlight the changes

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API: (yes / no)
  • The schema: (yes / no / don't know)
  • The default values of configurations: (yes / no)
  • The wire protocol: (yes / no)
  • The rest endpoints: (yes / no)
  • The admin cli options: (yes / no)
  • Anything that affects deployment: (yes / no / don't know)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)
  • If a feature is not applicable for documentation, explain why?
  • If a feature is not documented yet in this PR, please create a followup issue for adding the documentation

ZhengFan and others added 30 commits July 2, 2019 12:28
…#4650)

* Issue apache#4638: Update Kafka connect-api to version 2.3.0

* remove 'block.on.buffer.full' property (already removed from kafka)
…e#4644)

* Improve and add authorization to function download and upload

* cleaning up

* fix bug
…he#4645)

* Allow consumers to retrieve the sequence ID set by the producer.

* fix comments.
Fixes apache#3216 
Implementation of offload to HDFS

### Motivation
Implementation of offload to HDFS

### Verifying this change
Add the test for this
…stats (apache#4615)

### Motivation

The broker throws an NPE when pulsar-admin tries to fetch stats-internal for a topic with a reader.

```
Caused by: java.lang.NullPointerException
	at org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.getProperties(ManagedCursorImpl.java:234) ~[classes/:?]
	at org.apache.pulsar.broker.service.persistent.PersistentTopic.lambda$getInternalStats$48(PersistentTopic.java:1461) ~[classes/:?]
	at java.lang.Iterable.forEach(Iterable.java:75) ~[?:1.8.0_92]
	at org.apache.pulsar.broker.service.persistent.PersistentTopic.getInternalStats(PersistentTopic.java:1446) ~[classes/:?]
	at org.apache.pulsar.broker.admin.impl.PersistentTopicsBase.internalGetInternalStats(PersistentTopicsBase.java:621) ~[classes/:?]
	at org.apache.pulsar.broker.admin.v2.PersistentTopics.getInternalStats(PersistentTopics.java:430) ~[classes/:?]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_92]
```
### Motivation

Currently, the partitioned-stats API response includes stats for each partition. However, if the number of partitions and clients is large, the size of the response will be very large. In such cases, it is useful to have a query parameter to get a response that does not include stats for each partition.

```sh
$ curl -s http://localhost:8080/admin/persistent/sample/standalone/ns1/pt1/partitioned-stats | jq .

{
  "msgRateIn": 0,
  "msgThroughputIn": 0,
  "msgRateOut": 0,
  "msgThroughputOut": 0,
  "averageMsgSize": 0,
  "storageSize": 0,
  "publishers": [],
  "subscriptions": {
    "sub1": {
      "msgRateOut": 0,
      "msgThroughputOut": 0,
      "msgRateRedeliver": 0,
      "msgBacklog": 0,
      "blockedSubscriptionOnUnackedMsgs": false,
      "msgDelayed": 0,
      "unackedMessages": 0,
      "msgRateExpired": 0,
      "consumers": [],
      "isReplicated": false
    }
  },
  "replication": {},
  "metadata": {
    "partitions": 2
  },
  "partitions": {
    "persistent://sample/standalone/ns1/pt1-partition-1": {
      "msgRateIn": 0,
      "msgThroughputIn": 0,
      "msgRateOut": 0,
      "msgThroughputOut": 0,
      "averageMsgSize": 0,
      "storageSize": 0,
      "publishers": [],
      "subscriptions": {
        "sub1": {
          "msgRateOut": 0,
          "msgThroughputOut": 0,
          "msgRateRedeliver": 0,
          "msgBacklog": 0,
          "blockedSubscriptionOnUnackedMsgs": false,
          "msgDelayed": 0,
          "unackedMessages": 0,
          "msgRateExpired": 0,
          "consumers": [],
          "isReplicated": false
        }
      },
      "replication": {},
      "deduplicationStatus": "Disabled"
    },
    "persistent://sample/standalone/ns1/pt1-partition-0": {
      "msgRateIn": 0,
      "msgThroughputIn": 0,
      "msgRateOut": 0,
      "msgThroughputOut": 0,
      "averageMsgSize": 0,
      "storageSize": 0,
      "publishers": [],
      "subscriptions": {
        "sub1": {
          "msgRateOut": 0,
          "msgThroughputOut": 0,
          "msgRateRedeliver": 0,
          "msgBacklog": 0,
          "blockedSubscriptionOnUnackedMsgs": false,
          "msgDelayed": 0,
          "unackedMessages": 0,
          "msgRateExpired": 0,
          "consumers": [],
          "isReplicated": false
        }
      },
      "replication": {},
      "deduplicationStatus": "Disabled"
    }
  }
}
```

### Modifications

Added a query parameter named `perPartition` to the partitioned-stats API. The default value is true.
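
A minimal Java sketch of the idea, not the broker's actual code: the class `PartitionedStatsSketch`, the method `buildResponse`, and the choice to return an empty `partitions` map when the flag is false are all assumptions made for illustration. A client would presumably opt out by appending `?perPartition=false` to the request URL.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified model of a partitioned-stats response.
class PartitionedStatsSketch {

    // Aggregated stats are always returned; per-partition entries only when perPartition is true.
    static Map<String, Object> buildResponse(Map<String, Long> backlogByPartition, boolean perPartition) {
        long totalBacklog = 0;
        for (long backlog : backlogByPartition.values()) {
            totalBacklog += backlog;
        }

        Map<String, Object> response = new LinkedHashMap<>();
        response.put("msgBacklog", totalBacklog);
        response.put("metadata", Collections.singletonMap("partitions", backlogByPartition.size()));

        // Assumption: when perPartition is false the map is left empty, keeping the payload small.
        Map<String, Long> partitions = new LinkedHashMap<>();
        if (perPartition) {
            partitions.putAll(backlogByPartition);
        }
        response.put("partitions", partitions);
        return response;
    }

    public static void main(String[] args) {
        Map<String, Long> backlog = new LinkedHashMap<>();
        backlog.put("persistent://sample/standalone/ns1/pt1-partition-0", 3L);
        backlog.put("persistent://sample/standalone/ns1/pt1-partition-1", 5L);
        System.out.println(buildResponse(backlog, true));
        System.out.println(buildResponse(backlog, false));
    }
}
```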
* fix issue when submitting NAR via file url

* fix unit tests

* add more specific errors

* fix test
…ation (apache#4670)

* Add code samples for Protobuf and Avro schemas in java

* Update the code snippets for JSON, Protobuf, and Avro schemas with the preferred form.
### Motivation

Release 2.4.0 doc
Added release notes for 2.4.0 release
…apache#4664)

### Motivation

Fixes apache#4655 

Some compilers define a macro named `DEBUG`, which clashes with the enum value name. Add a prefix to avoid the macro replacement.
Fixes apache#4228

Master Issue: apache#4228

### Motivation

Use Pulsar schema in pulsar kafka client.

### Modifications

Support Pulsar schemas in the Pulsar Kafka client.

### Verifying this change

Add Unit test
…pache#4358)

### Motivation

Currently, the JDBC sink does not support delete and update events.
This change adds support for them.

### Modifications

Add support for delete and update events.
Add documentation for the JDBC sink. apache#4073

### Verifying this change

Local unit tests pass.
Integration tests pass.
Update Debezium version to 0.9.5.Final
Fixes apache#4606

Master Issue: apache#4606

### Motivation

Refine the framework and implement language-specific code tabs

### Modifications

Upgrade Docusaurus to version 1.11.1 to support code tabs.
* Add static import statements for Assert to simplify the test in the presto module

* Use the preferred way of the schema's creation. The predicates and functions were converted to lambda
* Fixed the default port for https and http in admin client

* Fixed test expectation

* Removed space added by mistake
feca5bb changed topic delete logic to delete the schema when the
topic is deleted (though this only seems to be enabled for idle topic
GC). This exposed a bug in compatibility checking whereby if a
subscription tries to attach to the topic, even if using the same
schema as had been used previously, a compatibility exception will be
thrown.

This is because the topic still appears to have a schema, even though
there is no actual schema data, just a tombstone. I've changed the logic
to return no schema if the schema read back is a tombstone.

The issue doesn't affect producers because the check was already
correct there.

I've also updated the check for transitive compatibility to remove the
prefix of schemas before the deleted schema. Previously this was
throwing an NPE on the broker as it couldn't decode the deleted
schema.

This issue was discovered by failures in the healthcheck. The check
period (5 minutes) was longer than the GC period (60 seconds). I would
expect it to hit quite often in other scenarios also.
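
A small Java sketch of the two behaviours described above, under stated assumptions: `StoredSchema`, `latestSchema`, and `schemasAfterLastDelete` are hypothetical names, and a tombstone is modelled as an entry with a `deleted` flag and no data. It is not the broker's schema registry code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

class SchemaTombstoneSketch {

    // Hypothetical stored entry: a deleted schema is kept as a tombstone without data.
    static final class StoredSchema {
        final byte[] data;
        final boolean deleted;

        StoredSchema(byte[] data, boolean deleted) {
            this.data = data;
            this.deleted = deleted;
        }
    }

    // Returns the latest schema, or empty when there is none or the latest entry is a tombstone.
    static Optional<StoredSchema> latestSchema(List<StoredSchema> history) {
        if (history.isEmpty()) {
            return Optional.empty();
        }
        StoredSchema latest = history.get(history.size() - 1);
        return latest.deleted ? Optional.empty() : Optional.of(latest);
    }

    // For transitive checks, keeps only the schemas stored after the most recent tombstone,
    // so the checker never has to decode a deleted schema.
    static List<StoredSchema> schemasAfterLastDelete(List<StoredSchema> history) {
        int start = 0;
        for (int i = 0; i < history.size(); i++) {
            if (history.get(i).deleted) {
                start = i + 1;
            }
        }
        return new ArrayList<>(history.subList(start, history.size()));
    }
}
```

With this shape, a consumer attaching to a topic whose schema was garbage-collected sees "no schema" and is treated like the first client of a new topic.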
* 2.4.0 release blog.

* fix comments.

* fix comments.

* fix comments.
…umentation (apache#4690)

* update master with correction to documentation for python reader

* Update site/docs/latest/clients/Python.md

Co-Authored-By: Matteo Merli <mmerli@apache.org>
freeznet and others added 29 commits July 26, 2019 09:47
### Motivation

Currently, a Go function must be compiled before it is deployed to Pulsar, so the function package needs executable permission once it has been downloaded from BookKeeper to the local node. This PR makes the Go function package executable after it is downloaded from BookKeeper, so that the function can run.
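
One way to express the permission step in Java is `java.io.File#setExecutable`. The sketch below is an illustration only; the class and method names are hypothetical, and the PR's actual change may be implemented differently.

```java
import java.io.File;

class ExecutablePackageSketch {

    // Grants the owner execute permission on a downloaded package and
    // reports whether the file is executable afterwards.
    static boolean makeExecutable(File downloadedPackage) {
        // setExecutable(true) changes the owner's execute permission only.
        boolean changed = downloadedPackage.setExecutable(true);
        return changed && downloadedPackage.canExecute();
    }
}
```

A caller would run `makeExecutable` right after the download completes and fail the function start early if it returns false.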
* [docs] add security warning on standalone doc

Add a warning to the standalone getting-started document. With the default configuration, Pulsar can be accessed from a remote server without any authentication or encryption, so a clear warning is critical to help users avoid unexpected security risks.
…ache#4717)

* add memory requirement and config tips for standalone mode

The current standalone getting-started doc does not mention or link to Pulsar's memory usage.
Users with limited free memory may have trouble starting Pulsar.
Add tips on how to change the default required heap memory.
* Changed remove-backlog-quotas to remove-backlog-quota

* Changed remove-backlog-quotas to remove-backlog-quota

* Increased the consumerName field to varchar(256)

Signed-off-by: Yuvaraj Loganathan <uvaraj6@gmail.com>
### Motivation

The Debezium documentation contains some typos that affect users, so fix them.

### Modifications

Fix typos in the Debezium documentation and format the content.
### Motivation

Switched back to use the regular `java.util.concurrent.CopyOnWriteArrayList` instead of the class extending it since we don't really have any advantage in accessing the underlying array of objects. 

The reflection being used to get that field is giving errors on Java 12.
### Motivation


Currently, if the Kubernetes namespace configured for deploying functions differs from the one in which the brokers/workers reside, get status and get stats do not work because the URL for instances does not specify the namespace.
…he#4709)

*Motivation*

When using PulsarService or BrokerService for testing, it might require accessing
the components in PulsarService and BrokerService. This change is adding setters
and getters to access the components in PulsarService & BrokerService
### Motivation

After the changes in apache#3118, there has been a sharp increase in memory utilization for the UnackedMessageTracker due to the time buckets being created.

This is especially true when the ack timeout is set to a larger value (e.g. 1h), where 3600 time buckets are created. This leads to using 20MB per partition even when no message is tracked.

Allow configuring the tick time so that applications can tune it based on their needs.

Additionally, fixed the logic that kept creating hash maps and throwing them away at each tick-time iteration, since that creates a lot of garbage and ignores the fact that the hash maps expand based on the required capacity (so next time they are already of the "right" size).

On a final note: the current default of 1 sec seems very wasteful. Something like 10s would be more appropriate as the default.
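
The bucket arithmetic can be checked with a short Java sketch. The class and method names are hypothetical, and rounding up to a whole bucket is an assumption; the tracker's real partitioning logic may differ.

```java
import java.util.concurrent.TimeUnit;

class AckTimeoutBucketsSketch {

    // Number of time buckets needed to cover the ack timeout at the given tick duration.
    // Assumption: a partial tick still needs its own bucket, so the division rounds up.
    static long bucketCount(long ackTimeoutMillis, long tickMillis) {
        if (tickMillis <= 0) {
            throw new IllegalArgumentException("tick duration must be positive");
        }
        return (ackTimeoutMillis + tickMillis - 1) / tickMillis;
    }

    public static void main(String[] args) {
        long oneHour = TimeUnit.HOURS.toMillis(1);
        System.out.println("1s tick:  " + bucketCount(oneHour, TimeUnit.SECONDS.toMillis(1)));
        System.out.println("10s tick: " + bucketCount(oneHour, TimeUnit.SECONDS.toMillis(10)));
    }
}
```

With a 1h ack timeout, moving from a 1s tick to a 10s tick cuts the bucket count by a factor of ten, at the cost of coarser timeout granularity.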
…e#4746)

### Motivation

The `pulsar-function-go/conf` package sets `instance-conf-path` to the default value `HOME_PATH+github.com/apache/pulsar/pulsar-function-go/conf/conf.yaml`. Once a function is deployed, the YAML conf file may not exist on the running node, and the Go function then panics with a `not found conf file` error.

This PR changes the config parsing logic: parse `confContent` first, then parse `confFilePath` if `confContent` is empty.
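
The lookup order can be sketched as follows. The real change is in the Go package; this Java version only illustrates the "inline content first, file second" rule, and `ConfResolutionSketch` and `resolveConf` are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class ConfResolutionSketch {

    // Returns the raw YAML text: inline content wins, and the file is read
    // only when the inline content is empty.
    static String resolveConf(String confContent, String confFilePath) {
        if (confContent != null && !confContent.isEmpty()) {
            return confContent;
        }
        try {
            return new String(Files.readAllBytes(Paths.get(confFilePath)), StandardCharsets.UTF_8);
        } catch (IOException e) {
            // Mirrors the "not found conf file" failure described above.
            throw new UncheckedIOException("not found conf file: " + confFilePath, e);
        }
    }
}
```

Because the file is touched only as a fallback, a node without the YAML file can still start a function as long as the content is passed inline.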
* Convert anonymous functions to lambda

* Replacing lambda with anonymous implementation, because lambda cannot be mocked
…ns (apache#4539)

* [Pulsar-Client] Add Producer Numeric Properties Validation

* Aligned deprecated and new Producer API validations

* Deprecated and new Producer API validations are being aligned

* batchingMaxMessages C++ API is being aligned with Java API

* batchingMaxMessages Java API Validation is being aligned with C++ API

* Review comments are addressed

* Fix broken UTs
* Add Upgrade Guide to Apache Pulsar

*Changes*

Add a general upgrade guide to apache pulsar.

* Update the upgrade guide
### Motivation

Fix apache#4732 

### Modifications

Add options to rewrite the namespace delimiter, disabled by default.

With namespace delimiter rewriting enabled, it works well with Superset:
<img width="1279" alt="superset" src="https://user-images.githubusercontent.com/12592133/61385412-f0f35700-a8e4-11e9-87b2-a31b62128b58.png">
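
A rough Java sketch of what such a rewrite option could look like. It assumes the delimiter being replaced is the `/` in `tenant/namespace`; the class name, method name, and parameters are hypothetical and are not the option names added by this change.

```java
class NamespaceDelimiterSketch {

    // Rewrites "tenant/namespace" using the configured delimiter when the option is enabled;
    // returns the name unchanged when it is disabled (the default described above).
    static String rewrite(String namespace, boolean rewriteEnabled, String delimiter) {
        if (!rewriteEnabled) {
            return namespace;
        }
        return namespace.replace("/", delimiter);
    }
}
```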


### Does this pull request potentially affect one of the following parts:

*If `yes` was chosen, please highlight the changes*

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API: (no)
  - The schema: (no)
  - The default values of configurations: (no)
  - The wire protocol: (no)
  - The rest endpoints: (no)
  - The admin cli options: (no)
  - Anything that affects deployment: (no)

### Documentation

  - Does this pull request introduce a new feature? (no)
* Added more blog posts in the resources page

* Added 2 more posts
Master Issue: apache#4756

### Motivation

This is a continuation of apache#4765.

### Modifications

Added async rest handlers to the following APIs:
```
DELETE /admin/namespaces/{tenant}/{cluster}/{namespace}
PUT    /admin/namespaces/{tenant}/{cluster}/{namespace}/unload
POST   /admin/namespaces/{tenant}/{cluster}/{namespace}/clearBacklog
POST   /admin/namespaces/{tenant}/{cluster}/{namespace}/clearBacklog/{subscription}
POST   /admin/namespaces/{tenant}/{cluster}/{namespace}/unsubscribe/{subscription}

DELETE /admin/v2/namespaces/{tenant}/{namespace}
PUT    /admin/v2/namespaces/{tenant}/{namespace}/unload
POST   /admin/v2/namespaces/{tenant}/{namespace}/clearBacklog
POST   /admin/v2/namespaces/{tenant}/{namespace}/clearBacklog/{subscription}
POST   /admin/v2/namespaces/{tenant}/{namespace}/unsubscribe/{subscription}
```
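
The general shape of an async handler can be sketched with plain `CompletableFuture`. This is an illustration under assumptions, not the PR's code: `unloadNamespaceAsync` and `handleUnload` are hypothetical, and the `Consumer<Integer>` callback stands in for whatever async response object the REST framework provides.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

class AsyncHandlerSketch {

    // Hypothetical admin operation that runs off the request thread.
    static CompletableFuture<Void> unloadNamespaceAsync(String namespace) {
        return CompletableFuture.runAsync(() -> {
            if (namespace.isEmpty()) {
                throw new IllegalArgumentException("namespace is required");
            }
            // The real work (unloading bundles, clearing backlog, ...) would happen here.
        });
    }

    // Returns immediately; the HTTP status is delivered to the callback once the work completes.
    static CompletableFuture<Void> handleUnload(String namespace, Consumer<Integer> resume) {
        return unloadNamespaceAsync(namespace)
                .handle((ok, err) -> err == null ? 204 : 500)
                .thenAccept(resume);
    }
}
```

The request thread is released as soon as `handleUnload` returns, which is the point of moving these endpoints to async handlers.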
…transaction (apache#4776)

* [Transaction][Buffer]Add new marker to show which message belongs to transaction
---

*Motivation*

Add new message types for transactions, covering data messages and the commit and abort markers in the transaction log.

*Modifications*

Add two new types of transaction messages.
TXN_COMMIT is the commit marker of the transaction.
TXN_ABORT is the abort marker of the transaction.
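
A minimal Java sketch of how the two markers might be consumed. Only the names `TXN_COMMIT` and `TXN_ABORT` come from the description above; the enum, the method, and the rule that committed messages are delivered while aborted ones are dropped are assumptions based on ordinary transaction semantics.

```java
class TxnMarkerSketch {

    // The two marker types named in the description above.
    enum MarkerType { TXN_COMMIT, TXN_ABORT }

    // Assumption: messages of a committed transaction become deliverable,
    // while messages of an aborted transaction are discarded.
    static boolean shouldDeliver(MarkerType marker) {
        switch (marker) {
            case TXN_COMMIT:
                return true;
            case TXN_ABORT:
                return false;
            default:
                throw new IllegalArgumentException("unknown marker: " + marker);
        }
    }
}
```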
Add a standalone chapter for Pulsar Schema.

This is the first section: Get started.
*Motivation*

Add a few recent presentations to the resources page. They cover different topics:

- 2.4.0 release
- use case
- serverless
- spark + pulsar
- flink + pulsar
…EST APIs (3) (apache#4795)

* Process requests asynchronously on some REST APIs (3)

* Add async rest handler to API for expiring message on single topic subscription
* Simplified assert statements in the tests. Switch to usage of static imports in tests. (Part 1)

* Simplify assert statements in the tests and use the appropriate assert statements. Switch to usage of static imports in tests. Remove unused imports (Part 2)
…with the GIVEN result, but not a CERTAIN result.
@easyfan easyfan closed this Jul 26, 2019