Unable to execute python code from calculated field #21

Closed
Timmeah opened this issue Feb 24, 2017 · 9 comments

@Timmeah

Timmeah commented Feb 24, 2017

Hi All,

For installation I followed the steps in the github: https://github.com/tableau/TabPy/blob/master/server.md#setup-on-windows so that it is installed on the same machine as Tableau Desktop (version 10.1.5, Windows 10).

I installed the TabPy server by downloading the zip and running startup.bat. This in turn installs Anaconda, and after a while it reports that it is running and listening on port 9004.

Next I perform the steps in Tableau desktop to connect to it: Help > Settings and Performance > External Service Connection, and add as server localhost and port 9004. I check the connection and it responds that it can communicate with the Predictive Service.

Using the blog found here: https://www.tableau.com/about/blog/2016/12/using-python-sentiment-analysis-tableau-63606 I performed a test drive. I removed all the comments to have the calculated field accept the code, and get to the following;

SCRIPT_REAL(
"
from nltk.sentiment import SentimentIntensityAnalyzer

text = _arg1
scores = []
sid = SentimentIntensityAnalyzer()

for word in text:
    ss = sid.polarity_scores(word)
    scores.append(ss['compound'])

return scores"
,ATTR([Word]))

When executing the calculated field, it returns;

Error processing script
ImportError : No module named sentiment

The issue found here nltk/nltk#1259 points to a version issue with NLTK, so I upgraded conda, python and nltk accordingly; all versions now meet the requirements.

After restarting the python server it still responds with;

ERROR:__main__:{"info": "ImportError : No module named sentiment", "ERROR": "Error processing script"}
ERROR:tornado.access:500 POST /evaluate (::1) 35.00ms

Okay, so maybe something is still an issue related to the NLTK package, so I try steps found in another blog, where the following calculated field is defined;

SCRIPT_REAL(
"import numpy as np
return np.corrcoef(_arg1,_arg2)[0,1]",SUM([Sales]),SUM([Profit]))

this results in no errors but warnings;

C:\Users\<user>\Anaconda\envs\Tableau-Python-Server\lib\site-packages\numpy\lib\function_base.py:1890: RuntimeWarning: Degrees of freedom <= 0 for slice
  warnings.warn("Degrees of freedom <= 0 for slice", RuntimeWarning)
C:\Users\<user>\Anaconda\envs\Tableau-Python-Server\lib\site-packages\numpy\lib\function_base.py:1901: RuntimeWarning: invalid value encountered in true_divide
  return (dot(X, X.T.conj()) / fact).squeeze()

yet nothing happens in the accompanying workbook or views.

In some other case this script results in the output;

Error processing script
TypeError : unsupported operand type(s) for +: 'float' and 'NoneType'

Any clue why the tableau python server (seems to) refuse to handle the requests from calculated fields with python scripts in Tableau?

Kind regards,

Tim

@BBeran
Contributor

BBeran commented Feb 25, 2017

Hi Tim,
I am not extremely familiar with the issues related to that sentiment package but here is the example I used in the announcement post for TabPy on Tableau blog (November 4th). First install the package

pip install VaderSentiment

Then in Tableau you can use the following calculated field

SCRIPT_REAL("from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
vs=[]
analyzer = SentimentIntensityAnalyzer()
for i in range(0,len(_arg1)):
a = analyzer.polarity_scores(_arg1[i])['compound']
vs.append(a)
return vs", ATTR([Comment Text]))

Please make sure all the dimensions in the view are being used for addressing (you can do this by clicking on the calculated field, selecting Edit Table Calculation, and then checking all the boxes for the items listed under Compute Using).

As for the correlation coefficient example, the issue is most likely with the addressing settings. It looks like items are being sent to Python one row at a time, and it is complaining that it can't compute a correlation coefficient from a single point.

You can see an example of how these settings can be used, with a screenshot, in the "Using Python in Tableau Calculations" section of the documentation.

I have seen two other blogs copy-paste the correlation coefficient example from there but omit this detail about addressing settings, which is confusing a lot of people.
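
The single-point problem described above can be reproduced outside Tableau. The following is a minimal sketch in plain Python (no numpy), using made-up sales and profit values and a hypothetical helper named `pearson`. It only illustrates why a partition holding one row yields no usable correlation while the whole partition does; it is not how Tableau or TabPy compute anything internally.

```python
import math


def pearson(xs, ys):
    """Pearson correlation, or None when it is undefined
    (fewer than two points, mismatched lengths, or zero variance)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        return None
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / math.sqrt(var_x * var_y)


# Illustrative values only; they do not come from any workbook in this thread.
sales = [100.0, 200.0, 300.0, 400.0]
profit = [10.0, 25.0, 28.0, 45.0]

# Addressing not set: each call sees a single row, so the result is undefined.
per_row = [pearson([s], [p]) for s, p in zip(sales, profit)]

# All dimensions used for addressing: one call sees the whole partition.
whole_partition = pearson(sales, profit)

print(per_row)
print(whole_partition)
```

With numpy, the undefined case shows up as the "Degrees of freedom <= 0" warning and a NaN result rather than as None.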

I hope this helps.

Bora

@BBeran BBeran closed this as completed Feb 25, 2017
@Timmeah
Author

Timmeah commented Feb 26, 2017

Dear Bora,

Thanks for your swift reply and information. I have performed the steps you suggested but to no avail. I'll give a thorough description of the steps I have taken.

First step is installing the package using "pip install VaderSentiment". This works successfully as expected.

Next, though I'm just listing this for completeness, the python code needed some fixing;
The lines of the "for loop" needed to be indented and the name of the field "CommentText" in the workbook is used without space between the words resulting in;

SCRIPT_REAL("
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

vs=[]
analyzer = SentimentIntensityAnalyzer()
for i in range(0,len(_arg1)):
    a = analyzer.polarity_scores(_arg1[i])['compound']
    vs.append(a)
return vs
", ATTR([CommentText]))

Next I tried connecting with the Python Tableau Server as described before and this results in success. However when I try to execute the code from a calculated field it responds with;

ERROR:tornado.access:500 POST /evaluate (::1) 23.00ms
ImportError : No module named vaderSentiment.vaderSentiment
ERROR:__main__:{"info": "ImportError : No module named vaderSentiment.vaderSentiment", "ERROR": "Error processing script"}
ERROR:tornado.access:500 POST /evaluate (::1) 16.00ms

Again this must be due to a package version issue as described before. Therefore I verify the versions of all the packages in Python;

C:\Users\<user>\Anaconda\envs\Tableau-Python-Server>conda update nltk
Fetching package metadata ...........
Solving package specifications: .
# All requested packages already installed.
# packages in environment at C:\Users\<user>\Anaconda:
nltk                      3.2.2                    py27_0

C:\Users\<user>\Anaconda\envs\Tableau-Python-Server>conda update conda
Fetching package metadata ...........
Solving package specifications: .
# All requested packages already installed.
# packages in environment at C:\Users\<user>\Anaconda:
conda                     4.3.13                   py27_0

C:\Users\<user>\Anaconda\envs\Tableau-Python-Server>conda update python
Fetching package metadata ...........
Solving package specifications: .
# All requested packages already installed.
# packages in environment at C:\Users\<user>\Anaconda:
python                    2.7.13                        0
C:\Users\<user>\Anaconda\envs\Tableau-Python-Server>

The strange thing is that when I verify the version of the nltk package in the python interactive shell it reports the following;

C:\Users\<user>\Anaconda\envs\Tableau-Python-Server>python
Python 2.7.10 |Anaconda 2.3.0 (64-bit)| (default, May 28 2015, 16:44:52) [MSC v.1500 64 bit (AMD64)] on win32
>>> import nltk
>>> print(nltk.__version__)
3.0.3
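
The mismatch above (conda reporting nltk 3.2.2 while the interpreter imports 3.0.3) suggests that more than one Python installation is in play. A minimal sketch for checking which interpreter is actually running and where a module is loaded from, using only the standard library; the helper names `describe_module` and `environment_report` are hypothetical:

```python
import importlib
import sys


def describe_module(name):
    """Return (version, file) for an importable module, or None if the import fails."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    version = getattr(module, "__version__", "unknown")
    location = getattr(module, "__file__", "built-in")
    return version, location


def environment_report(module_names):
    """Collect the interpreter path, its prefix, and the status of each module."""
    report = {"executable": sys.executable, "prefix": sys.prefix}
    for name in module_names:
        report[name] = describe_module(name)
    return report


def print_report(module_names):
    """Print one line per entry so the output can be compared between shells."""
    for key, value in environment_report(module_names).items():
        print("{} -> {}".format(key, value))
```

Calling `print_report(["nltk", "nltk.sentiment", "vaderSentiment.vaderSentiment"])` from the same shell that launches startup.bat should show whether that interpreter lives under the Tableau-Python-Server environment or under the root Anaconda install. A value of None means the import fails in that interpreter.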

Therefore I decided to do a manual install and create a new environment. After I set up this new environment I first made sure all the components were up to date again, given the versions above. Afterwards I have two environments;

C:\Users\<user>>conda info --envs
# conda environments:
#
Tableau-Python-Server     C:\Users\<user>\Anaconda\envs\Tableau-Python-Server
Tableau-Python-Server-Env     C:\Users\<user>\Anaconda\envs\Tableau-Python-Server-Env

Activate the new environment;

C:\Users\<user>\Anaconda\envs\Tableau-Python-Server-Env>activate Tableau-Python-Server-Env

And install the tabpy-server;

(Tableau-Python-Server-Env) C:\Users\<user>\Anaconda\envs\Tableau-Python-Server-Env>pip install tabpy-server

and run it;

(Tableau-Python-Server-Env) C:\Users\<user>\Anaconda\envs\Tableau-Python-Server\Lib\site-packages\tabpy_server>startup.bat
Initializing TabPy...
Done initializing TabPy.
Web service listening on port 9004

ERROR:tornado.access:500 POST /evaluate (::1) 23.00ms
ImportError : No module named vaderSentiment.vaderSentiment
ERROR:__main__:{"info": "ImportError : No module named vaderSentiment.vaderSentiment", "ERROR": "Error processing script"}
ERROR:tornado.access:500 POST /evaluate (::1) 16.00ms

Admittedly not an equal comparison as I don't have Tableau and the TabPy connector running on it (yet), but the following 2 steps immediately work on a linux distro;

tim@tim:~$ pip install VaderSentiment
Collecting VaderSentiment
  Downloading vaderSentiment-2.5-py2.py3-none-any.whl (102kB)
    100% |████████████████████████████████| 112kB 1.4MB/s 
Installing collected packages: VaderSentiment
Successfully installed VaderSentiment-2.5

tim@tim:~/<path_to>/TabPy$ python
Python 2.7.12 (default, Nov 19 2016, 06:48:10) 
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
>>>

Admittedly I'm going on a hunch here, but it seems newer versions of packages are not picked up by the Anaconda / Python environment, and therefore the imports still fail when Tableau tries to run the python code through the connector.

Any help is appreciated!

Kind regards,

Tim

@BBeran
Contributor

BBeran commented Feb 28, 2017

Hi Tim,
TabPy should automatically pick up the packages without needing to restart the server. Could you do me a favor and check if vaderSentiment is installed in the Tableau-Python-Server environment? I think the issue might be that tabpy and the sentiment package are installed in different Python environments.

Does this folder exist?

C:\Users\<user>\Anaconda\envs\Tableau-Python-Server\Lib\site-packages\vaderSentiment

If not, can you try activating the Tableau-Python-Server environment first and then installing vaderSentiment?
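
The folder check can also be done from Python itself, which has the advantage of testing the site-packages directory of whichever interpreter is actually running. A minimal sketch, assuming a standard CPython layout where `sysconfig` reports the site-packages path under the "purelib" key; the function name is hypothetical:

```python
import os
import sysconfig


def package_dir_in_site_packages(package_name, site_packages=None):
    """Return (candidate_path, exists) for a package folder in site-packages.

    When site_packages is not given, the running interpreter's own
    site-packages directory is used.
    """
    if site_packages is None:
        site_packages = sysconfig.get_paths()["purelib"]
    candidate = os.path.join(site_packages, package_name)
    return candidate, os.path.isdir(candidate)


def print_check(package_name):
    """Print where the package was looked for and whether it was found."""
    path, found = package_dir_in_site_packages(package_name)
    print("{} -> {}".format(path, "found" if found else "missing"))
```

Running `print_check("vaderSentiment")` after activating the Tableau-Python-Server environment should print a path inside that environment. A path outside it would mean a different Python is first on the PATH.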

Thank you,

Bora

@Timmeah
Author

Timmeah commented Mar 6, 2017

Hi Beran,

Thank you for your reply. I've followed your steps;

C:\Users\<user>>activate Tableau-Python-Server

(Tableau-Python-Server) C:\Users\<user>\Anaconda\envs\Tableau-Python-Server\Lib\site-packages>pip install VaderSentiment
Collecting VaderSentiment
  Using cached vaderSentiment-2.5-py2.py3-none-any.whl
Installing collected packages: VaderSentiment
Successfully installed VaderSentiment-2.5

(Tableau-Python-Server) C:\Users\<user>\Anaconda\envs\Tableau-Python-Server\Lib\site-packages> dir
...
03/06/2017  02:47 PM    <DIR>          vaderSentiment
03/06/2017  02:47 PM    <DIR>          vaderSentiment-2.5.dist-info
...

The package is there, in the Tableau-Python-Server, and I start it using;

(Tableau-Python-Server) C:\Users\<user>\Anaconda\envs\Tableau-Python-Server\Lib\site-packages\tabpy_server>startup.bat

I test the connection between Tableau and the TabPy server, this returns;

Successfully connected to the Predictive Service.

The issues still persist. I verified that the Python code in the calculated field uses the vaderSentiment.vaderSentiment import, yet I get the following error messages;

(Tableau-Python-Server) C:\Users\<user>\Anaconda\envs\Tableau-Python-Server\Lib\site-packages\tabpy_server>startup.bat
Initializing TabPy...
Done initializing TabPy.
Web service listening on port 9004
ImportError : No module named sentiment
ERROR:__main__:{"info": "ImportError : No module named sentiment", "ERROR": "Error processing script"}
ERROR:tornado.access:500 POST /evaluate (::1) 11943.00ms
ImportError : No module named sentiment
ERROR:__main__:{"info": "ImportError : No module named sentiment", "ERROR": "Error processing script"}
ERROR:tornado.access:500 POST /evaluate (::1) 81.00ms
TypeError : unsupported operand type(s) for +: 'float' and 'NoneType'
ERROR:__main__:{"info": "TypeError : unsupported operand type(s) for +: 'float' and 'NoneType'", "ERROR": "Error processing script"}
ERROR:tornado.access:500 POST /evaluate (::1) 59.00ms
TypeError : unsupported operand type(s) for +: 'float' and 'NoneType'
ERROR:__main__:{"info": "TypeError : unsupported operand type(s) for +: 'float' and 'NoneType'", "ERROR": "Error processing script"}
ERROR:tornado.access:500 POST /evaluate (::1) 4.00ms
ERROR:__main__:{"endpoint_name": "DiagnosticsDemo", "ERROR": "UnknownURI"}
Endpoint 'DiagnosticsDemo' does not exist
ERROR:__main__:{"info": "Endpoint 'DiagnosticsDemo' does not exist", "ERROR": "UnknownURI"}
The endpoint you're trying to query did not respond. Please make sure the endpoint exists and the correct set of arguments are provided.
ERROR:__main__:{"info": "The endpoint you're trying to query did not respond. Please make sure the endpoint exists and the correct set of arguments are provided.", "ERROR": "Error processing script"}

Error processing script
The endpoint you're trying to query did not respond. Please make sure the endpoint exists and the correct set of arguments are provided.

The last messages are from the workbooks that are provided to test TabPy that use the sentiment package.

I've also tried a correlation coefficients workbook and the UnsupervisedModel packaged workbook, which use the numpy and sklearn packages respectively, and these don't result in errors.

To eliminate possible issues, might I ask for a packaged workbook that contains data and a python based calculated field using the vaderSentiment package, and that is known to work with TabPy? That would allow me to verify that the issue is not in the workbook but somewhere else.
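
Another way to separate workbook problems from server problems is to call the /evaluate endpoint seen in the logs directly, without Tableau. The sketch below is written for Python 3 and assumes the request body is a JSON object with a "script" string and a "data" object whose keys are _arg1, _arg2 and so on, which is my understanding of the TabPy REST interface rather than something confirmed in this thread. The helper names and the default URL constant are hypothetical.

```python
import json
from urllib.request import Request, urlopen

# Host and port as used in this thread; adjust if the server runs elsewhere.
TABPY_URL = "http://localhost:9004/evaluate"


def build_evaluate_payload(script, *args):
    """Build the JSON request body; positional lists become _arg1, _arg2, ..."""
    data = {"_arg{}".format(i): list(values) for i, values in enumerate(args, start=1)}
    return json.dumps({"data": data, "script": script}).encode("utf-8")


def evaluate(script, *args, url=TABPY_URL):
    """POST a script and its arguments to the server and return the decoded reply."""
    request = Request(
        url,
        data=build_evaluate_payload(script, *args),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, something like `evaluate("return [len(s) for s in _arg1]", ["good", "bad"])` exercises the same code path as a calculated field. A script error comes back as HTTP 500, which `urlopen` raises as `urllib.error.HTTPError`.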

Thanks in advance.

Kind regards,

Tim

@BBeran
Contributor

BBeran commented Mar 7, 2017

Hi Tim,
Errors are a bit confusing. You should be loading

from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

the error says no module named sentiment which implies that your import directive is trying to load something called sentiment.

SCRIPT_REAL("
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
vs=[]
analyzer = SentimentIntensityAnalyzer()
for i in range(0,len(_arg1)):
    a = analyzer.polarity_scores(_arg1[i])['compound']
    vs.append(a)
return vs
", ATTR([CommentText]))

And the second error is from the DiagnosticsDemo sample workbook, which requires deploying the model first. The deployment steps are provided as a Jupyter notebook in the blog post.

https://www.tableau.com/about/blog/2017/1/building-advanced-analytics-applications-tabpy-64916

@srikanth-komarkalva

Timmeah:
I got the same error and got it resolved. It occurs because there was no sentiment subpackage in the installed NLTK package. You will also need to download the vader_lexicon pack as part of NLTK.

Follow the steps below:

  1. Try updating the nltk package with the command below:
    pip install --upgrade nltk

  2. Go to the Python console and give the commands below:
    cmd> python
    >>> import nltk
    >>> nltk.download()

You will see a pop-up screen asking you to select the pack to download. Select vader_lexicon.
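
On a machine without a display, the pop-up step can be replaced by passing the resource name to the downloader. A minimal sketch, assuming the lexicon is registered under the id "vader_lexicon" and stored at "sentiment/vader_lexicon.zip" inside the NLTK data directory; the two helper names are hypothetical:

```python
def has_vader_lexicon():
    """Return True if NLTK is importable and can locate the VADER lexicon."""
    try:
        import nltk
    except ImportError:
        return False
    try:
        nltk.data.find("sentiment/vader_lexicon.zip")
    except LookupError:
        return False
    return True


def ensure_vader_lexicon():
    """Download the lexicon if it is missing, then report whether it is present.

    Needs NLTK installed and network access for the download itself.
    """
    import nltk
    if not has_vader_lexicon():
        nltk.download("vader_lexicon")
    return has_vader_lexicon()
```

Calling `ensure_vader_lexicon()` once in the environment that runs TabPy should be enough; `has_vader_lexicon()` can be used afterwards as a quick check.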

Voila !! You are done.

Let me know if it works.

Srikanth

@markwu2000

I had the same error "ImportError : No module named sentiment". What happened for me is that the package was installed in a different Python env. I resolved the issue by copying the vaderSentiment package from the other env to /Users/<user>/anaconda/envs/Tableau-Python-Server/Lib/python2.7/site-packages/

@wguicheney

Hi Bora,

I'd first like to thank you for all your work on TabPy and for being so active with the community!

I've run into the same issue as Timmeah regarding the correlation calculation (exactly the same error message, stating there is an unsupported operand type(s) for +: 'float' and 'NoneType').

You'll notice I've made sure my Table Calc is set to compute using Country, the only dimension in the view, so I'm not sure where to go from here!

[Screenshot: capture]

@Proteusiq

Do your inputs contain Null values?
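
For context, the TypeError quoted above is what plain Python raises when a None ends up in a numeric sum. A minimal sketch, assuming Tableau Null values arrive in the script as None (an assumption, not something confirmed in this thread), with made-up values and a hypothetical helper that drops any pair containing a None before the calculation:

```python
def drop_null_pairs(xs, ys):
    """Keep only the positions where both values are present (not None)."""
    kept = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    return [x for x, _ in kept], [y for _, y in kept]


# Illustrative values only; None stands in for a Null coming from the view.
sales = [10.0, None, 30.0, 40.0]
profit = [1.0, 2.0, None, 4.0]

# Summing a list that contains None fails with the same kind of TypeError.
try:
    sum(sales)
    error = None
except TypeError as exc:
    error = str(exc)

clean_sales, clean_profit = drop_null_pairs(sales, profit)

print(error)
print(clean_sales, clean_profit)
```

Whether dropping such rows is appropriate depends on the calculation; for a single aggregate such as a correlation coefficient it is usually acceptable, while a script that returns one value per row would need to keep the positions aligned.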
