SNOW-647070: Installing Pandas and Snowflake separately #443

Closed
kamipatel opened this issue Aug 17, 2022 · 1 comment
Labels
bug Something isn't working needs triage Initial RCA is required


kamipatel commented Aug 17, 2022

I installed Snowpark and pandas separately. This code works on macOS; however, when I run it on Amazon Linux 2, it fails.

import os
from tokenize import String
import pandas as pd
import json
from io import StringIO
from datetime import datetime
from botocore.exceptions import ClientError
from snowflake.snowpark import Session
from snowflake.snowpark.types import StructType, StructField, StringType, IntegerType
from snowflake.connector.pandas_tools import write_pandas
from snowflake.snowpark.functions import when_matched, when_not_matched
from snowflake.snowpark.functions import col, lit

connection_parameters = {
}

def write_to_snowflake(cdf):
    session = Session.builder.configs(connection_parameters).create()
    df = session.create_dataframe(cdf)
    df.write.mode("overwrite").save_as_table("cdp_staging", table_type="temporary")
    print("after stage")
    session.close()

def stage():
    data = {
        "calories": [420, 380, 390],
        "duration": [50, 40, 45]
    }
    df = pd.DataFrame(data)
    write_to_snowflake(df)

stage()

At run time I get this error:
"create_dataframe() function only accepts data as a list, tuple or a pandas DataFrame."

Does it need a specific version of pandas? Thanks!

What version of Python are you using?
3.8

What operating system and processor architecture are you using?

Linux-4.14.287-215.504.amzn2.x86_64-x86_64-with-glibc2.2.5

What are the component versions in the environment (pip freeze)?
asn1crypto==1.5.1
boto3==1.24.53
botocore==1.27.53
certifi==2022.6.15
cffi==1.15.1
charset-normalizer==2.1.0
cloudpickle==2.0.0
cryptography==36.0.2
idna==3.3
jmespath==1.0.1
numpy==1.23.2
oscrypto==1.3.0
pandas==1.4.3
pycparser==2.21
pycryptodomex==3.15.0
PyJWT==2.4.0
pyOpenSSL==22.0.0
python-dateutil==2.8.2
pytz==2022.2.1
requests==2.28.1
s3transfer==0.6.0
six==1.16.0
snowflake-connector-python==2.7.11
snowflake-snowpark-python==0.8.0
typing_extensions==4.3.0
urllib3==1.26.11

What did you do?
Open your AWS Cloud9 Amazon EC2 environment.
Install Python 3.8 and pip3 by running the following commands:
$ sudo amazon-linux-extras install python3.8
$ curl -O https://bootstrap.pypa.io/get-pip.py
$ python3.8 get-pip.py --user
Create a python folder by running the following commands:
$ python3.8 -m pip install snowflake-snowpark-python -t python/ --upgrade
$ python3.8 -m pip install pandas -t python/ --upgrade

What did you expect to see?
I expected pandas and Snowpark to work on Amazon Linux 2, since I need to run this code in AWS Lambda.

Can you set logging to DEBUG and collect the logs?
Response
{
"errorMessage": "create_dataframe() function only accepts data as a list, tuple or a pandas DataFrame.",
"errorType": "TypeError",
"stackTrace": [
" File "/var/lang/lib/python3.8/imp.py", line 234, in load_module\n return load_source(name, filename, file)\n",
" File "/var/lang/lib/python3.8/imp.py", line 171, in load_source\n module = _load(spec)\n",
" File "", line 702, in _load\n",
" File "", line 671, in _load_unlocked\n",
" File "", line 843, in exec_module\n",
" File "", line 219, in _call_with_frames_removed\n",
" File "/var/task/lambda_function.py", line 81, in \n stage()\n",
" File "/var/task/lambda_function.py", line 62, in stage\n write_to_snowflake(df)\n",
" File "/var/task/lambda_function.py", line 45, in write_to_snowflake\n df=session.create_dataframe(cdf)\n",
" File "/opt/python/snowflake/snowpark/session.py", line 1133, in create_dataframe\n raise TypeError(\n"
]
}

@kamipatel kamipatel added bug Something isn't working needs triage Initial RCA is required labels Aug 17, 2022
@github-actions github-actions bot changed the title Installing Pandas and Snowflake separately SNOW-647070: Installing Pandas and Snowflake separately Aug 17, 2022
kamipatel (Author) commented

I found the issue: pyarrow needs to be installed separately. All good, thanks.
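
For anyone who lands here with the same error, here is a minimal preflight sketch. It assumes, as in this report, that the failure comes from pyarrow being absent from the environment even though pandas is installed. The names `REQUIRED` and `missing_modules` are hypothetical helpers for illustration, not part of Snowpark or the Snowflake connector.

```python
import importlib.util

# Assumption: in this setup, both modules must be importable before
# session.create_dataframe() will accept a pandas DataFrame.
REQUIRED = ("pandas", "pyarrow")


def missing_modules(names=REQUIRED):
    """Return the names from `names` that cannot be located on sys.path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("pandas and pyarrow are both importable")
```

Running this at the top of the Lambda handler (or once in the Cloud9 shell) should show whether pyarrow is the missing piece. If it is, installing it into the same target folder, following the pattern of the commands in the report (`python3.8 -m pip install pyarrow -t python/ --upgrade`), is the likely fix; which pyarrow version the connector accepts is not stated in this thread, so check the connector's requirements.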
