[SPARK-8535][PySpark]PySpark : Can't create DataFrame from Pandas dataframe with no explicit column name #7124

x1- · 2015-06-30T11:10:21Z

The implicit column names of a pandas DataFrame (pandas.columns) are integers, but the StructField JSON expects strings.
So I think the pandas.columns values should be converted to strings.

issue

SPARK-8535 PySpark : Can't create DataFrame from Pandas dataframe with no explicit column name
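The conversion described above can be sketched in a few lines. This is a minimal illustration, not necessarily the exact code in the patch: the helper name `normalize_column_names` is hypothetical, and `range(3)` stands in for the integer labels pandas assigns when no column names are given, so the snippet runs without pandas installed.

```python
def normalize_column_names(columns):
    """Convert every column label to a string, preserving order.

    Hypothetical helper for illustration; it accepts any iterable of labels.
    """
    return [str(label) for label in columns]


# pandas labels unnamed columns 0, 1, 2, ...; range() stands in for them here.
implicit_labels = range(3)
print(normalize_column_names(implicit_labels))

# Labels that are already strings pass through unchanged.
print(normalize_column_names(["name", "age"]))
```

With pandas available, the same helper could presumably be applied to `pdf.columns`, and the resulting list of strings used as the column names for the schema.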

AmplabJenkins · 2015-06-30T11:12:11Z

Can one of the admins verify this patch?

JoshRosen · 2015-06-30T15:13:24Z

Jenkins this is ok to test

AmplabJenkins · 2015-06-30T15:17:13Z

Merged build triggered.

AmplabJenkins · 2015-06-30T15:17:24Z

Merged build started.

SparkQA · 2015-06-30T15:18:04Z

Test build #36145 has started for PR 7124 at commit ea1897d.

SparkQA · 2015-06-30T16:44:10Z

Test build #36145 has finished for PR 7124 at commit ea1897d.

This patch fails PySpark unit tests.
This patch merges cleanly.
This patch adds no public classes.

AmplabJenkins · 2015-06-30T16:44:17Z

Merged build finished. Test FAILed.

davies · 2015-06-30T18:25:06Z

python/pyspark/sql/context.py

@@ -342,13 +342,15 @@ def createDataFrame(self, data, schema=None, samplingRatio=None):

        >>> sqlContext.createDataFrame(df.toPandas()).collect()  # doctest: +SKIP
        [Row(name=u'Alice', age=1)]
+        >>> sqlContext.createDataFrame(pandas.DataFrame([[1, 2]]).collect())


There is no pandas on Jenkins, so we need to skip the test with # doctest: +SKIP.

JoshRosen · Jun 30, 2015

Hey @shaneknapp, want to help us install Pandas? 😄

davies · Jun 30, 2015

But PySpark SQL does not depend on pandas.

x1- · Jul 1, 2015

@JoshRosen @davies
I'm sorry.
I added # doctest: +SKIP.
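For readers unfamiliar with the directive discussed in this thread, here is a small, self-contained sketch of how `# doctest: +SKIP` behaves. The function `labels_as_strings` and the runner `run_examples` are hypothetical names chosen for illustration. The first example in the docstring runs everywhere, while the two pandas examples are skipped, so the docstring still passes on a machine without pandas.

```python
import doctest


def labels_as_strings(labels):
    """Return the given labels as a list of strings.

    >>> labels_as_strings([0, 1])
    ['0', '1']
    >>> import pandas  # doctest: +SKIP
    >>> labels_as_strings(pandas.DataFrame([[1, 2]]).columns)  # doctest: +SKIP
    ['0', '1']
    """
    return [str(label) for label in labels]


def run_examples():
    """Run the docstring examples above and return doctest's TestResults."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(
        labels_as_strings.__doc__,
        {"labels_as_strings": labels_as_strings},  # names visible to the examples
        "labels_as_strings",
        None,
        0,
    )
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)


results = run_examples()
# Skipped examples are not counted as attempted, so only one example runs.
print(results.failed, results.attempted)
```

The directive applies per example, which is why it has to appear on each line that touches pandas, including the import.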

AmplabJenkins · 2015-07-01T00:48:10Z

Merged build triggered.

AmplabJenkins · 2015-07-01T00:48:15Z

Merged build started.

davies · 2015-07-01T00:50:36Z

LGTM

SparkQA · 2015-07-01T00:51:15Z

Test build #36216 has started for PR 7124 at commit d68fd38.

x1- · 2015-07-01T00:52:19Z

Thank you very much ✨ 🙇 ✨

SparkQA · 2015-07-01T01:12:53Z

Test build #36216 has finished for PR 7124 at commit d68fd38.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

AmplabJenkins · 2015-07-01T01:13:16Z

Merged build finished. Test PASSed.

…ataframe with no explicit column name

Because implicit name of `pandas.columns` are Int, but `StructField` json expect `String`.
So I think `pandas.columns` are should be convert to `String`.

### issue

* [SPARK-8535 PySpark : Can't create DataFrame from Pandas dataframe with no explicit column name](https://issues.apache.org/jira/browse/SPARK-8535)

Author: x1- <viva008@gmail.com>

Closes #7124 from x1-/SPARK-8535 and squashes the following commits:

d68fd38 [x1-] modify unit-test using pandas.
ea1897d [x1-] For implicit name of pandas.columns are Int, so should be convert to String.

(cherry picked from commit b6e76ed)
Signed-off-by: Davies Liu <davies@databricks.com>

davies · 2015-07-01T03:39:35Z

Merged into master and the 1.3 and 1.4 branches, thanks!

For implicit name of pandas.columns are Int, so should be convert to String.

ea1897d

davies reviewed Jun 30, 2015

modify unit-test using pandas.

d68fd38

asfgit closed this in b6e76ed Jul 1, 2015
