Unable to install Optimus #391

Closed
cpatte7372 opened this issue Nov 9, 2018 · 17 comments

Comments

@cpatte7372

Hello all,

I'm really excited about what I've read about Optimus. Unfortunately, I'm coming across a number of issues getting it up and running.

I have followed the instructions here:

https://medium.com/hi-optimus/how-to-install-jupyter-notebook-4-4-0-and-optimus-on-ubuntu-18-04-92ff5ef30ea4

However, whenever I run the following in my notebook, I get this error:

import optimus as op

ModuleNotFoundError Traceback (most recent call last)
in ()
----> 1 import optimus as op

ModuleNotFoundError: No module named 'optimus'

I'm running Python 3.6.

I feel that this is a simple error that I have made, but I'm at a loss.

Any help will be greatly appreciated.

Cheers

@FavioVazquez
Collaborator

Hi! Thanks for the interest. That post is a little outdated. You can run:

pip install optimuspyspark

And then check the examples from the master branch!

You have to do a different import now.

Please let me know if this works for you! :)
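
For anyone hitting the same ModuleNotFoundError: a quick way to confirm whether the package is visible to the interpreter the notebook actually uses is sketched below. This is only a minimal check built on the standard library; the helper name is_installed is made up for illustration, and the module name optimus is taken from the import used later in this thread.

```python
import importlib.util
import sys


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found by this interpreter."""
    return importlib.util.find_spec(module_name) is not None


# Show which Python executable is running, then whether it can see the module.
print(sys.executable)
print(is_installed("optimus"))
```

If this prints False right after a successful pip install, the package most likely went into a different Python installation than the one shown by sys.executable.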

@cpatte7372
Author

Hi FavioVazquez,

Thanks for reaching out. I'm sure I tried your suggestion and it didn't work out ... going to try it again now and let you know.

@cpatte7372
Author

OK,

I'm not getting the following error from the terminal when I issue the following command after installing pip install optimuspyspark from https://github.com/ironmussa/Optimus

from optimus import Optimus
op = Optimus()

packt@ubuntu-c:~/.local/lib/python3.5/site-packages/optimus$ from optimus import Optimus
from: can't read /var/mail/optimus
packt@ubuntu-c:~/.local/lib/python3.5/site-packages/optimus$ op = Optimus()
bash: syntax error near unexpected token `('
packt@ubuntu-c:~/.local/lib/python3.5/site-packages/optimus$

@cpatte7372
Author

Sorry, I meant to say I'm NOW getting the following error in the terminal when I run the commands below after installing optimuspyspark (pip install optimuspyspark) from https://github.com/ironmussa/Optimus

from optimus import Optimus
op = Optimus()

packt@ubuntu-c:~/.local/lib/python3.5/site-packages/optimus$ from optimus import Optimus
from: can't read /var/mail/optimus
packt@ubuntu-c:~/.local/lib/python3.5/site-packages/optimus$ op = Optimus()
bash: syntax error near unexpected token `('
packt@ubuntu-c:~/.local/lib/python3.5/site-packages/optimus$
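
The two messages above come from the shell, not from Python: the lines were typed at the bash prompt, so bash ran the system's from mail utility and then failed on the parentheses. A minimal sketch of handing statements to the Python interpreter instead; the helper run_in_python is hypothetical, and a trivial statement stands in for the Optimus import so the sketch runs without the package:

```python
import subprocess
import sys


def run_in_python(source: str) -> str:
    """Run Python source in a fresh interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
        check=True,  # raise if the child interpreter exits with an error
    )
    return result.stdout.strip()


# The statement is executed by Python, not by bash.
print(run_in_python("print(1 + 1)"))
```

In day-to-day use the simpler route is to start python3 and type the lines at the >>> prompt, or to save them in a .py file and run that file.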

@cpatte7372
Author

Any thoughts?

@cpatte7372
Author

I tried to install on another Ubuntu machine with the command sudo pip install optimuspyspark and got the following error:

packt@ubuntu2:~$ sudo pip install optimuspyspark
The directory '/home/packt/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/home/packt/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Collecting optimuspyspark
  Downloading https://files.pythonhosted.org/packages/56/6a/93e4f3cedb245028820915173d932e2002cb6121a60c2cab6b013963cc91/optimuspyspark-2.1.3.tar.gz (82kB)
    100% |████████████████████████████████| 92kB 269kB/s 
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-ust1jm/optimuspyspark/setup.py", line 14, in <module>
        with open('requirements.txt') as f:
    IOError: [Errno 2] No such file or directory: 'requirements.txt'
    
    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-ust1jm/optimuspyspark/
packt@ubuntu2:~$ 
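
One detail in the traceback above: IOError is what Python 2 reports for a missing file (Python 3 would report FileNotFoundError), so this sudo pip appears to belong to Python 2, while the first post mentions Python 3.6. A minimal sketch, assuming the goal is to install into the same interpreter that runs the notebook; the helper name pip_install_command is made up, and whether optimuspyspark 2.1.3 then installs cleanly is not confirmed anywhere in this thread:

```python
import sys


def pip_install_command(package: str) -> list:
    """Build a pip command bound to the interpreter running this code."""
    # "python -m pip" ensures pip and the interpreter match;
    # "--user" installs into the user's site-packages, so no sudo is needed.
    return [sys.executable, "-m", "pip", "install", "--user", package]


# Print the interpreter version and the command to run in a terminal.
print(sys.version_info[:2])
print(" ".join(pip_install_command("optimuspyspark")))
```

Note that --user is rejected inside a virtual environment; there the flag can simply be dropped.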

@cpatte7372
Author

Any help will be greatly appreciated

@cpatte7372
Author

Hi, I managed to get this up and running. However, when I go through the example shown here
https://github.com/ironmussa/Optimus

I ran the command df.cols.replace("product","taaaccoo","taco") but there isn't any change to the table.

Any reasons why?

@argenisleon
Collaborator

Hi @cpatte7372,
It should work. I just executed:

df = op.load.url("https://raw.githubusercontent.com/ironmussa/Optimus/master/examples/data/foo.csv")
df.table()

image

def func(value, arg):
    return "this was a number"
    
df\
    .rows.sort("product","desc")\
    .cols.lower(["firstName","lastName"])\
    .cols.date_transform("birth", "yyyy/MM/dd", "dd-MM-YYYY")\
    .cols.years_between("birth", "yyyy/MM/dd")\
    .cols.remove_accents("lastName")\
    .cols.remove_special_chars("lastName")\
    .cols.replace("product","taaaccoo","taco")\
    .cols.replace("product",["piza","pizzza"],"pizza")\
    .rows.drop(df["id"]<7)\
    .cols.drop("dummyCol")\
    .cols.rename(str.lower)\
    .cols.apply_by_dtypes("product",func,"string", data_type="integer")\
    .cols.trim("*")\
    .table()

image

Is this what you expect?
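
The thread never pins down why the single replace call appeared to change nothing. One common cause with Spark DataFrames is that they are immutable: a transformation returns a new DataFrame instead of modifying df, so the result has to be displayed or assigned. Python strings behave the same way, which gives a small runnable analogy (plain str here, not the Optimus API):

```python
product = "taaaccoo"

# The call returns a new string; the result is discarded here,
# so the original value is unchanged.
product.replace("taaaccoo", "taco")
print(product)  # taaaccoo

# Keeping the returned value is what makes the change visible.
fixed = product.replace("taaaccoo", "taco")
print(fixed)  # taco
```

By the same logic, assigning the result (df = df.cols.replace(...)) or chaining .table() onto the call, as in the example above, would be the thing to try. That is an assumption, not something confirmed in this thread.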

@cpatte7372
Author

cpatte7372 commented Nov 11, 2018 via email

@cpatte7372
Author

Hi argenisleon,

Any idea where I can find more cleansing examples?

Can you let me know where I can find a list of switches as shown below?
.cols.remove_accents
.cols.replace
.cols.rename

Thanks
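
One generic way to see which operations an object offers is plain Python introspection. A minimal sketch: public_methods is a made-up helper, and the FakeCols class is only a stand-in so the example runs without Optimus. In a live session one could try it on df.cols, assuming that object exposes its operations as ordinary methods:

```python
def public_methods(obj):
    """Return the sorted names of an object's public callable attributes."""
    return sorted(
        name
        for name in dir(obj)
        if not name.startswith("_") and callable(getattr(obj, name))
    )


class FakeCols:
    """Hypothetical stand-in with three method names seen in this thread."""

    def replace(self): ...
    def rename(self): ...
    def remove_accents(self): ...


print(public_methods(FakeCols()))  # ['remove_accents', 'rename', 'replace']
```

Calling help() on any single method then shows its docstring and parameters, where the library provides them.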

@argenisleon
Collaborator

@cpatte7372
About the first question:

def func(value, arg):
    return "this was a number"

This is a function applied to all the integers in the 'product' column. More specifically, it transforms every integer into the string "this was a number".

.cols.apply_by_dtypes("product",func,"string", data_type="integer")
  1. You can find all the column and row operations here and here

You can also find a lot of examples and use cases in the examples folder.

Hope this helps.
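
The idea behind the apply_by_dtypes call above can be sketched in plain Python. This is not how Optimus implements it; looks_like_int and apply_to_integers are hypothetical helpers, the sample values are invented, and arg is simply passed through because its exact role in Optimus is not spelled out in this thread:

```python
def func(value, arg):
    return "this was a number"


def looks_like_int(value) -> bool:
    """Return True if the value can be parsed as an integer."""
    try:
        int(str(value))
        return True
    except ValueError:
        return False


def apply_to_integers(values, fn, arg=None):
    """Apply fn only to integer-like values; leave everything else untouched."""
    return [fn(v, arg) if looks_like_int(v) else v for v in values]


products = ["pizza", "110790", "taco"]
print(apply_to_integers(products, func))
# ['pizza', 'this was a number', 'taco']
```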

@cpatte7372
Author

@argenisleon

thanks for the information, that's great.

I'm trying to run the same commands from my Databricks workspace notebook (as you probably know Databricks is built on Spark and uses the same principle as Jupyter Notebook).

I issue the following commands:

!pip install optimuspyspark
!from optimus import Optimus
!op= Optimus()
!df =op.load.url("https://raw.githubusercontent.com/ironmussa/Optimus/master/examples/data/foo.csv")
!table()

However, there isn't any output. Do you have any idea why?

Cheers

@argenisleon
Collaborator

@FavioVazquez can you please look at this?

@FavioVazquez
Collaborator

Hi @cpatte7372! table() is a method of df, so you have to run:

df.table()

Please let me know if this helps :)
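
Putting the pieces of this thread together: in IPython-style notebooks a leading ! hands the line to the shell, so only the pip line should carry it, and the remaining lines are ordinary Python. A sketch using only calls that appear in this thread (Optimus(), op.load.url, df.table()); the try/except guard is an addition so the cell fails gracefully where the package is missing, and Databricks-specific behaviour is not verified here:

```python
# In a notebook, run this once in its own cell (shell command, hence the "!"):
#   !pip install optimuspyspark

try:
    from optimus import Optimus
except ImportError:
    # Fall back gracefully when the package is not available.
    Optimus = None
    print("optimus is not importable from this interpreter")

if Optimus is not None:
    op = Optimus()
    df = op.load.url(
        "https://raw.githubusercontent.com/ironmussa/Optimus/master/examples/data/foo.csv"
    )
    df.table()  # table() is called on the DataFrame, not on its own
```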

@argenisleon
Collaborator

@cpatte7372 Did you solve the problem?

@cpatte7372
Author

@argenisleon this fixed the problem.

Thank you


3 participants