skipmissing(x) not defined #1302

Closed
Datseris opened this issue Dec 2, 2017 · 7 comments

Comments

@Datseris
Contributor

Datseris commented Dec 2, 2017

I load a CSV file with missing data via using CSV, DataFrames; CSV.read(myfile), and the missing data are represented using the word #NULL instead of missing, which is NOT the same as in the documentation page (first red flag there).

Then I try to use the function skipmissing, but it is undefined: UndefVarError: skipmissing not defined. I have also tried DataFrames.skipmissing and CSV.skipmissing.

I thought that the stable documentation matches the version installed by Pkg.add and Pkg.update, but this does not seem to be the case here?

In the meantime, can somebody tell me how I can drop NULL values from my data? It is kind of super-crucial that I do it as fast as possible.

@nalimilan
Member

Can you post the versions of DataFrames, CSV, and Missings you are using? It looks like you don't have the latest version of Missings.

@Datseris
Contributor Author

Datseris commented Dec 2, 2017

I got - DataFrames 0.10.1....
Why isn't Pkg.update() updating it?

(Edit: sorry if my tone is aggressive. I am under tremendous pressure right now, since the documentation and what happens on my computer are two completely different things.)

@nalimilan
Member

Because many dependencies are incompatible with it. See https://discourse.julialang.org/t/dataframes-0-11-released/7296/.

@Datseris
Contributor Author

Datseris commented Dec 2, 2017

I am looking at: #1232

If you can help me identify which packages I need to "remove", I would really appreciate it...

40 required packages:
 - Atom                          0.6.5
 - BenchmarkTools                0.2.2
 - CSV                           0.1.5
 - Combinatorics                 0.5.0
 - DataFrames                    0.10.1
 - DataTables                    0.0.3
 - DiffEqBase                    2.6.1
 - Distributions                 0.15.0
 - Documenter                    0.12.3
 - ForwardDiff                   0.7.0
 - IJulia                        1.6.2
 - Interact                      0.6.3
 - Interpolations                0.7.2
 - IterTools                     0.1.0
 - JLD                           0.8.3
 - KernelDensity                 0.4.0
 - MIDI                          0.2.0+             master
 - Measurements                  0.5.0+             master
 - NLsolve                       0.13.0
 - NearestNeighbors              0.3.0
 - ODE                           0.7.0
 - OhMyREPL                      0.2.10
 - OrdinaryDiffEq                2.30.0
 - ParameterizedFunctions        2.3.0
 - PkgDev                        0.1.6
 - Plots                         0.13.1
 - ProfileView                   0.3.0
 - ProgressMeter                 0.5.2
 - PyCall                        1.15.0
 - PyPlot                        2.3.2
 - Reactive                      0.6.0
 - RecursiveArrayTools           0.12.4
 - RegionTrees                   0.1.0
 - Requires                      0.4.3
 - StaticArrays                  0.6.6
 - SymPy                         0.5.4
 - TaylorIntegration             0.2.0
 - TaylorSeries                  0.7.0
 - TypeSortedCollections         0.2.0
 - Unrolled                      0.0.1

On the other hand, I am looking at the documentation for version 0.10.1. But the problem is that the null values in my data right now are represented as:

149-element NullableArrays.NullableArray{Int64,1}:
 4    
 #NULL
 #NULL
 #NULL
 #NULL
 #NULL
 #NULL

a[2]
Nullable{Int64}()

which means that dropna doesn't do anything. Oh man, this is really stressful :(

@nalimilan
Member

I'd avoid using CSV.jl with DataFrames 0.10.1. If you have a deadline, better keep using readtable to read CSV files, and work with DataArrays.

@Datseris
Contributor Author

Datseris commented Dec 2, 2017

Thank you, thank you, thank you! With readtable, dropna works!

@nalimilan
Member

Cool, closing then. Try moving to version 0.11 when you have more time.
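
For readers who land here after upgrading, here is a minimal sketch of what the documented skipmissing call looks like once it is available. It assumes Julia 1.0 or later, where missing and skipmissing are part of Base and need no package; the vector is made-up illustration data, not the file from this issue, and none of this applies to the DataFrames 0.10.1 setup discussed above.

```julia
# Assumes Julia 1.0 or later, where `missing` and `skipmissing` live in Base.
# The vector below is made-up illustration data, not the data from this issue.
a = [4, missing, missing, 7]

# skipmissing returns a lazy iterator over the non-missing entries;
# collect materializes it into an ordinary vector of integers.
clean = collect(skipmissing(a))

# Reductions can consume the iterator directly, without collecting first.
total = sum(skipmissing(a))

println(clean)
println(total)
```

The lazy iterator avoids copying the column when only a reduction such as sum is needed; collect is only required when an actual array without missing entries is wanted.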
