Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

resolve merge conflicts in vectorized parquet reader #33

Merged
merged 4 commits into from
Mar 18, 2016

Conversation

sameeragarwal
Copy link

No description provided.

yhuai and others added 4 commits March 18, 2016 13:40
… converted SQL query, we eagerly trigger analysis

## What changes were proposed in this pull request?
As part of testing generating SQL query from a analyzed SQL plan, we run the generated SQL for tests in HiveComparisonTest. This PR makes the generated SQL get eagerly analyzed. So, when a generated SQL has any analysis error, we can see the error message created by
```
                  case NonFatal(e) => fail(
                    s"""Failed to analyze the converted SQL string:
                        |
                        |# Original HiveQL query string:
                        |$queryString
                        |
                        |# Resolved query plan:
                        |${originalQuery.analyzed.treeString}
                        |
                        |# Converted SQL query string:
                        |$convertedSQL
                     """.stripMargin, e)
```

Right now, if we can parse a generated SQL but fail to analyze it, we will see error message generated by the following code (it only mentions that we cannot execute the original query, i.e. `queryString`).
```
            case e: Throwable =>
              val errorMessage =
                s"""
                  |Failed to execute query using catalyst:
                  |Error: ${e.getMessage}
                  |${stackTraceToString(e)}
                  |$queryString
                  |$query
                  |== HIVE - ${hive.size} row(s) ==
                  |${hive.mkString("\n")}
                """.stripMargin
```

## How was this patch tested?
Existing tests.

Author: Yin Huai <yhuai@databricks.com>

Closes apache#11825 from yhuai/SPARK-13972-follow-up.
…eader

## What changes were proposed in this pull request?

This PR cleans up the new parquet record reader with the following changes:

1. Removes the non-vectorized parquet reader code from `UnsafeRowParquetRecordReader`.
2. Removes the non-vectorized column reader code from `ColumnReader`.
3. Renames `UnsafeRowParquetRecordReader` to `VectorizedParquetRecordReader` and `ColumnReader` to `VectorizedColumnReader`
4. Deprecate `PARQUET_UNSAFE_ROW_RECORD_READER_ENABLED`

## How was this patch tested?

Refactoring only; Existing tests should reveal any problems.

Author: Sameer Agarwal <sameer@databricks.com>

Closes apache#11799 from sameeragarwal/vectorized-parquet.
marmbrus added a commit that referenced this pull request Mar 18, 2016
resolve merge conflicts in vectorized parquet reader
@marmbrus marmbrus merged commit 40037e6 into marmbrus:parquetReader Mar 18, 2016
marmbrus pushed a commit that referenced this pull request Dec 21, 2017
## What changes were proposed in this pull request?

This PR upgrade Janino version to 3.0.8. [Janino 3.0.8](https://janino-compiler.github.io/janino/changelog.html) includes an important fix to reduce the number of constant pool entries by using 'sipush' java bytecode.

* SIPUSH bytecode is not used for short integer constant [#33](janino-compiler/janino#33).

Please see detail in [this discussion thread](apache#19518 (comment)).

## How was this patch tested?

Existing tests

Author: Kazuaki Ishizaki <ishizaki@jp.ibm.com>

Closes apache#19890 from kiszk/SPARK-22688.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants