Merged
2 changes: 1 addition & 1 deletion docs/content/_index.md
@@ -45,7 +45,7 @@ Paimon offers the following core capabilities:

## Try Paimon

If youre interested in playing around with Paimon, check out our
If you're interested in playing around with Paimon, check out our
quick start guide with [Flink]({{< ref "flink/quick-start" >}}) or [Spark]({{< ref "spark/quick-start" >}}). It provides a step by
step introduction to the APIs and guides you through real applications.

2 changes: 1 addition & 1 deletion docs/content/append-table/blob.md
@@ -576,7 +576,7 @@ public class BlobDescriptorExample {
long fileSize = 2L * 1024 * 1024 * 1024; // 2GB

BlobDescriptor descriptor = new BlobDescriptor(externalUri, 0, fileSize);
// file io should be accessable to externalUri
// file io should be accessible to externalUri
FileIO fileIO = Table.fileIO();
UriReader uriReader = UriReader.fromFile(fileIO);
Blob blob = Blob.fromDescriptor(uriReader, descriptor);
2 changes: 1 addition & 1 deletion docs/content/append-table/incremental-clustering.md
@@ -101,7 +101,7 @@ only support running Incremental Clustering in batch mode.

To run an Incremental Clustering job, follow these instructions.

You dont need to specify any clustering-related parameters when running Incremental Clustering,
You don't need to specify any clustering-related parameters when running Incremental Clustering;
these options are already defined as table options. If you need to change clustering settings, please update the corresponding table options.

{{< tabs "incremental-clustering" >}}
2 changes: 1 addition & 1 deletion docs/content/concepts/overview.md
@@ -56,7 +56,7 @@ For streaming engines like Apache Flink, there are typically three types of conn
intermediate stages in this pipeline, to guarantee the latency stays
within seconds.
- OLAP system, such as ClickHouse, it receives processed data in
streaming fashion and serving users ad-hoc queries.
streaming fashion and serving users' ad-hoc queries.
- Batch storage, such as Apache Hive, it supports various operations
of the traditional batch processing, including `INSERT OVERWRITE`.

2 changes: 1 addition & 1 deletion docs/content/ecosystem/starrocks.md
@@ -79,7 +79,7 @@ SELECT * FROM paimon_catalog.test_db.partition_tbl$partitions;
## StarRocks to Paimon type mapping

This section lists all supported type conversion between StarRocks and Paimon.
All StarRockss data types can be found in this doc [StarRocks Data type overview](https://docs.starrocks.io/docs/sql-reference/data-types/).
All StarRocks's data types can be found in this doc [StarRocks Data type overview](https://docs.starrocks.io/docs/sql-reference/data-types/).

<table class="table table-bordered">
<thead>
2 changes: 1 addition & 1 deletion docs/content/flink/procedures.md
@@ -703,7 +703,7 @@ All available procedures are listed below.
<td>
To expire partitions. Argument:
<li>table: the target table identifier. Cannot be empty.</li>
<li>expiration_time: the expiration interval of a partition. A partition will be expired if it's lifetime is over this value. Partition time is extracted from the partition value.</li>
<li>expiration_time: the expiration interval of a partition. A partition will be expired if its lifetime is over this value. Partition time is extracted from the partition value.</li>
<li>timestamp_formatter: the formatter to format timestamp from string.</li>
<li>timestamp_pattern: the pattern to get a timestamp from partitions.</li>
<li>expire_strategy: specifies the expiration strategy for partition expiration, possible values: 'values-time' or 'update-time' , 'values-time' as default.</li>
2 changes: 1 addition & 1 deletion docs/content/learn-paimon/understand-files.md
@@ -496,5 +496,5 @@ Maybe you think the 5 files for the primary key table are actually okay, but the
may have 50 small files in a single bucket, which is very difficult to accept. Worse still, partitions that
are no longer active also keep so many small files.

Configure full-compaction.delta-commits perform full-compaction periodically in Flink writing. And it can ensure
Configure 'full-compaction.delta-commits' to perform full-compaction periodically in Flink writing. It can ensure
that partitions are fully compacted before writing ends.
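As context for the hunk above: `full-compaction.delta-commits` is a regular table option, so it can be set with a dynamic options statement. A minimal, illustrative sketch via Flink SQL (the table name and value here are hypothetical, not from the PR):

```sql
-- Hypothetical table; trigger a full compaction every 5 delta commits
ALTER TABLE my_table SET ('full-compaction.delta-commits' = '5');
```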
2 changes: 1 addition & 1 deletion docs/content/maintenance/filesystems.md
@@ -391,7 +391,7 @@ Please refer to [Trino S3](https://trino.io/docs/current/object-storage/file-sys

### S3 Compliant Object Stores

The S3 Filesystem also support using S3 compliant object stores such as MinIO, Tencent's COS and IBMs Cloud Object
The S3 Filesystem also supports using S3 compliant object stores such as MinIO, Tencent's COS and IBM's Cloud Object
Storage. Just configure your endpoint to the provider of the object store service.

```yaml
2 changes: 1 addition & 1 deletion docs/content/primary-key-table/chain-table.md
@@ -30,7 +30,7 @@ Chain table is a new capability for primary key tables that transforms how you p
Imagine a scenario where you periodically store a full snapshot of data (for example, once a day), even
though only a small portion changes between snapshots. ODS binlog dump is a typical example of this pattern.

Taking a daily binlog dump job as an example. A batch job merges yesterdays full dataset with today’s
Taking a daily binlog dump job as an example. A batch job merges yesterday's full dataset with today's
incremental changes to produce a new full dataset. This approach has two clear drawbacks:
* Full computation: Merge operation includes all data, and it will involve shuffle, which results in poor performance.
* Full storage: Store a full set of data every day, and the changed data usually accounts for a very small proportion.
2 changes: 1 addition & 1 deletion docs/content/primary-key-table/changelog-producer.md
@@ -58,7 +58,7 @@ By specifying `'changelog-producer' = 'input'`, Paimon writers rely on their inp

## Lookup

If your input cant produce a complete changelog but you still want to get rid of the costly normalized operator, you
If your input can't produce a complete changelog but you still want to get rid of the costly normalized operator, you
may consider using the `'lookup'` changelog producer.

By specifying `'changelog-producer' = 'lookup'`, Paimon will generate changelog through `'lookup'` during compaction (You can also enable [Async Compaction]({{< ref "primary-key-table/compaction#asynchronous-compaction" >}})). By default, lookup compaction is performed before committing written data unless disabled by `write-only` property.
2 changes: 1 addition & 1 deletion docs/content/primary-key-table/merge-engine/aggregation.md
@@ -308,7 +308,7 @@ public static class BitmapContainsUDF extends ScalarFunction {

Use `fields.<field-name>.nested-key=pk0,pk1,...` to specify the primary keys of the nested table. If no keys are specified, rows will be appended to the array<row>.

Use `fields.<field-name>.count-limit=<Interger>` to specify the maximum number of rows in the nested table. When no nested-key, it will select data
Use `fields.<field-name>.count-limit=<Integer>` to specify the maximum number of rows in the nested table. When no nested-key is specified, it will select data
sequentially up to the limit; but if nested-key is specified, the correctness of the aggregation result cannot be guaranteed. This option can be used to
avoid abnormal input.

@@ -219,7 +219,7 @@ SELECT *
FROM AGG;
-- output 1, 3, 2, 2, "1", 1, 2

-- g_1, g_3 are smaller, a should not beupdated
-- g_1, g_3 are smaller, a should not be updated
INSERT INTO AGG
VALUES (1, 3, 3, 2, '3', 3, 1);

2 changes: 1 addition & 1 deletion docs/content/program-api/cpp-api.md
@@ -35,7 +35,7 @@ format with maximum efficiency.

[Paimon C++](https://github.com/alibaba/paimon-cpp.git) is currently governed under Alibaba open source
community. You can checkout the [document](https://alibaba.github.io/paimon-cpp/getting_started.html)
for more details about envinroment settings.
for more details about environment settings.

```sh
git clone https://github.com/alibaba/paimon-cpp.git
2 changes: 1 addition & 1 deletion docs/content/project/contributing.md
@@ -44,7 +44,7 @@ Contributing to Apache Paimon goes beyond writing code for the project. Below, w
<tbody>
<tr>
<td><span class="glyphicon glyphicon-exclamation-sign" aria-hidden="true"></span> Report Bug</td>
<td>To report a problem with Paimon, open <a href="https://github.com/apache/paimon/issues">Paimons issues</a>. <br/>
<td>To report a problem with Paimon, open <a href="https://github.com/apache/paimon/issues">Paimon's issues</a>. <br/>
Please give detailed information about the problem you encountered and, if possible, add a description that helps to reproduce the problem.</td>
</tr>
<tr>
2 changes: 1 addition & 1 deletion docs/content/spark/dataframe.md
@@ -75,7 +75,7 @@ data.write.format("paimon")
You can achieve REPLACE TABLE semantics by setting the mode to `overwrite` with `saveAsTable` or `save`.

It first drops the existing table and then creates a new one,
so you need to specify the tables properties or partition columns if needed.
so you need to specify the table's properties or partition columns if needed.

```scala
val data: DataFrame = ...
2 changes: 1 addition & 1 deletion docs/content/spark/procedures.md
@@ -77,7 +77,7 @@ This section introduce all available spark procedures about paimon.
<td>
To expire partitions. Argument:
<li>table: the target table identifier. Cannot be empty.</li>
<li>expiration_time: the expiration interval of a partition. A partition will be expired if it's lifetime is over this value. Partition time is extracted from the partition value.</li>
<li>expiration_time: the expiration interval of a partition. A partition will be expired if its lifetime is over this value. Partition time is extracted from the partition value.</li>
<li>timestamp_formatter: the formatter to format timestamp from string.</li>
<li>timestamp_pattern: the pattern to get a timestamp from partitions.</li>
<li>expire_strategy: specifies the expiration strategy for partition expiration, possible values: 'values-time' or 'update-time' , 'values-time' as default.</li>
4 changes: 2 additions & 2 deletions docs/content/spark/quick-start.md
@@ -107,7 +107,7 @@ Alternatively, you can copy `paimon-spark-3.5_2.12-{{< version >}}.jar` under `s

{{< tab "Catalog" >}}

When starting `spark-sql`, use the following command to register Paimons Spark catalog with the name `paimon`. Table files of the warehouse is stored under `/tmp/paimon`.
When starting `spark-sql`, use the following command to register Paimon's Spark catalog with the name `paimon`. Table files of the warehouse are stored under `/tmp/paimon`.

```bash
spark-sql ... \
@@ -133,7 +133,7 @@ can use the `spark_catalog.${database_name}.${table_name}` to access Spark table

{{< tab "Generic Catalog" >}}

When starting `spark-sql`, use the following command to register Paimons Spark Generic catalog to replace Spark
When starting `spark-sql`, use the following command to register Paimon's Spark Generic catalog to replace Spark
default catalog `spark_catalog`. (default warehouse is Spark `spark.sql.warehouse.dir`)

Currently, it is only recommended to use `SparkGenericCatalog` in the case of Hive metastore; Paimon will infer
2 changes: 1 addition & 1 deletion docs/content/spark/sql-ddl.md
@@ -219,7 +219,7 @@ CREATE TABLE my_table (
```

Furthermore, if there is already data stored in the specified location, you can create the table without explicitly specifying the fields, partitions and props or other information.
In this case, the new table will inherit them all from the existing tables metadata.
In this case, the new table will inherit them all from the existing table's metadata.

However, if you manually specify them, you need to ensure that they are consistent with those of the existing table (props can be a subset). Therefore, it is strongly recommended not to specify them.
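As a sketch of the behavior the paragraph above describes (the path and table name here are hypothetical, and the exact form assumes a Spark session with the Paimon catalog configured):

```sql
-- Hypothetical location; the new table inherits fields, partitions
-- and props from the existing table's metadata stored at this path.
CREATE TABLE my_table LOCATION '/path/to/existing/table';
```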

2 changes: 1 addition & 1 deletion docs/layouts/shortcodes/generated/core_configuration.html
@@ -987,7 +987,7 @@
<td><h5>partition.expiration-time</h5></td>
<td style="word-wrap: break-word;">(none)</td>
<td>Duration</td>
<td>The expiration interval of a partition. A partition will be expired if it's lifetime is over this value. Partition time is extracted from the partition value.</td>
<td>The expiration interval of a partition. A partition will be expired if its lifetime is over this value. Partition time is extracted from the partition value.</td>
</tr>
<tr>
<td><h5>partition.idle-time-to-report-statistic</h5></td>
2 changes: 1 addition & 1 deletion docs/layouts/shortcodes/generated/orc_configuration.html
@@ -42,7 +42,7 @@
<td><h5>orc.timestamp-ltz.legacy.type</h5></td>
<td style="word-wrap: break-word;">true</td>
<td>Boolean</td>
<td>This option is used to be compatible with the paimon-orcs old behavior for the `timestamp_ltz` data type.</td>
<td>This option is used to be compatible with the paimon-orc's old behavior for the `timestamp_ltz` data type.</td>
</tr>
</tbody>
</table>
@@ -1103,7 +1103,7 @@ public InlineElement getDescription() {
.noDefaultValue()
.withDescription(
"The expiration interval of a partition. A partition will be expired if"
+ " it's lifetime is over this value. Partition time is extracted from"
+ " its lifetime is over this value. Partition time is extracted from"
+ " the partition value.");

public static final ConfigOption<Duration> PARTITION_EXPIRATION_CHECK_INTERVAL =
@@ -47,5 +47,5 @@ public class OrcOptions {
.booleanType()
.defaultValue(true)
.withDescription(
"This option is used to be compatible with the paimon-orcs old behavior for the `timestamp_ltz` data type.");
"This option is used to be compatible with the paimon-orc's old behavior for the `timestamp_ltz` data type.");
}
@@ -173,7 +173,7 @@ private static int getFilesWithIncompatibleLicenses(Path jar, Path jarRoot) thro
// JSON License
"The Software shall be used for Good, not Evil.",
// can sometimes be found in "funny" licenses
"Dont be evil"));
"Don't be evil"));
}

private static Collection<Pattern> asPatterns(String... texts) {