Error in rbind(deparse.level, ...) #19

Closed
bartling opened this issue Jan 15, 2015 · 23 comments

@bartling

I encountered the following error using read.socrata:

> data <- read.socrata("https://data.cityofchicago.org/resource/kn9c-c2s2.json")
Error in rbind(deparse.level, ...) : 
  numbers of columns of arguments do not match
In addition: Warning message:
'guess_media' is deprecated.
Use 'mime::guess_type' instead.
See help("Deprecated") 
@tomschenkjr
Contributor

Thank you for the full explanation behind the error. We have provided a patch for this, but it is not yet on CRAN. You can download the fix at https://github.com/Chicago/RSocrata/releases/tag/v1.5.1. It can be installed with:

install.packages("/path/to/RSocrata-1.5.1.tar.gz", repos=NULL, type='source')

This issue is related to issue #7, but is explained more clearly here.

@bartling
Author

Thanks, Tom. I'm actually having the same issue after installing 1.5.1

> install.packages("RSocrata-1.5.1.tar.gz", repos=NULL, type='source')
Installing package into ‘/home/hugh/R/i686-pc-linux-gnu-library/3.1’
(as ‘lib’ is unspecified)
Warning in untar2(tarfile, files, list, exdir, restore_times) :
  skipping pax global extended headers
* installing *source* package ‘RSocrata’ ...
** R
** preparing package for lazy loading
** help
No man pages found in package ‘RSocrata’
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (RSocrata)
> library(RSocrata)
> social <- read.socrata("https://data.cityofchicago.org/resource/kn9c-c2s2.json") 
Error in rbind(deparse.level, ...) : 
  numbers of columns of arguments do not match
In addition: Warning message:
'guess_media' is deprecated.
Use 'mime::guess_type' instead.
See help("Deprecated") 

@kent37
Contributor

kent37 commented Feb 22, 2015

I'm seeing this issue in the github version. The problem is that the data from Socrata is missing fields. If you go to https://data.cityofchicago.org/Health-Human-Services/Census-Data-Selected-socioeconomic-indicators-in-C/kn9c-c2s2 and scroll to the bottom, the Community Area Number (a numeric field) contains text, and the HARDSHIP INDEX is blank. In the JSON response these fields are missing which causes the rbind() to fail.

I reported this to Socrata as https://support.socrata.com/hc/requests/7131
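
For readers following along, here is a minimal base R sketch of the failure mode described above. The records and field names are made up for illustration (they are not the actual Socrata response), and the real code path inside RSocrata may differ.

```r
# Hypothetical parsed JSON records: the second one is missing two fields,
# the way a blank cell can be omitted from a JSON response.
records <- list(
  list(area_number = "1", area_name = "Example A", hardship_index = "39"),
  list(area_name = "Example B")
)

# Which fields is each record missing, relative to the union of all fields?
all_fields <- unique(unlist(lapply(records, names)))
missing_by_record <- lapply(records, function(r) setdiff(all_fields, names(r)))
print(missing_by_record)

# One single-row data frame per record; they end up with 3 and 1 columns.
rows <- lapply(records, as.data.frame, stringsAsFactors = FALSE)

# rbind() on data frames with different column counts raises the error
# reported in this issue; tryCatch() captures the message instead of stopping.
err <- tryCatch(do.call(rbind, rows), error = function(e) conditionMessage(e))
print(err)
```

Listing the missing fields per record first makes it easier to see which rows of a dataset trigger the mismatch.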

@tomschenkjr
Contributor

Yes, it's possible that a dataset may contain incompatible data types. This is being deprecated with the new version of the API, but it will be a while before that is entirely resolved in existing datasets.

I've tested with 1.6.0 build 7 on the master branch (82a1375) and it seems to be able to download that dataset. You can use it now, and the next version on CRAN will support it (ETA: March).

@kent37
Contributor

kent37 commented Feb 23, 2015

I am still seeing this error. I did

> devtools::install_github('Chicago/RSocrata')
> data <- read.socrata("https://data.cityofchicago.org/resource/kn9c-c2s2.json")
Error in rbind(deparse.level, ...) : 
  numbers of columns of arguments do not match

The DESCRIPTION of the installed RSocrata says

GithubRepo: RSocrata
GithubUsername: Chicago
GithubRef: master
GithubSHA1: 82a13753271ee7face09f8b6762b0d4430a28089

so I seem to have the correct version...

Is there a specific changeset that is supposed to fix this?

@tomschenkjr
Contributor

Ah, this error occurs when using the .json format. It turns out the error still exists with the kn9c-c2s2.json extension, but it works for read.socrata("https://data.cityofchicago.org/resource/kn9c-c2s2.csv") and read.socrata("https://data.cityofchicago.org/id/kn9c-c2s2").
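
As a rough sketch of the workaround, a .json resource URL can be rewritten to its .csv counterpart before it is passed to read.socrata(). The helper name to_csv_endpoint is hypothetical (it is not part of RSocrata), and the sketch assumes the URL ends in .json, optionally followed by a query string.

```r
# Hypothetical helper: swap a trailing ".json" for ".csv", keeping any
# query string that follows it.
to_csv_endpoint <- function(url) {
  sub("\\.json(\\?|$)", ".csv\\1", url)
}

csv_url <- to_csv_endpoint("https://data.cityofchicago.org/resource/kn9c-c2s2.json")
print(csv_url)

# With RSocrata installed, the CSV endpoint can then be read as usual:
# data <- RSocrata::read.socrata(csv_url)
```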

Will look at this and resolve. I'll remove the duplicate tag and prioritize accordingly. In the meantime, if you run across this error, please use the alternative formulations.

Thanks for digging into this.

@kent37
Contributor

kent37 commented Feb 24, 2015

Yes, the error is in the .json format. I am using the .csv. I hope you can fix the .json as well.

@geneorama
Member

This has been fixed in the sprint7 branch, and I added an informal test: R/tests/Uneven_JSON_issue_19.R

tomschenkjr added a commit that referenced this issue Oct 6, 2016
@PriyaDoIT PriyaDoIT reopened this Oct 17, 2016
@PriyaDoIT

@nicklucius to add a test for downloading uneven columns to support why we need rbind.fill

@nicklucius
Contributor

nicklucius commented Oct 17, 2016

I think I know why the tests are not failing when rbind.fill() in read.socrata() is commented out and rbind() is used.

The "missing fields issue" that we see in some rows of JSON files can wreak havoc at two points:

POINT 1. When combining 1000 or fewer rows within a single page (fromJSON() is used here)
POINT 2. When combining the pages together (this is the only time rbind.fill() is used)

Before #102 was merged, the failure point for all the examples we had was POINT 1. There are no examples I know of that would test POINT 2. If we used rbind() instead of rbind.fill(), the function would fail if some column of the dataset were entirely NA in the first 1000 rows. When the first page is processed and converted into a data frame, that column would not be part of the data frame. Then at POINT 2, rbind() would encounter data frames with different numbers of columns.

So I don't think we can test for this until we come across a dataset with a column of all NAs in the first 1000 rows.
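
The POINT 2 scenario can be sketched with two small made-up "pages". This uses base R only: bind_fill below is a hypothetical stand-in that pads missing columns with NA, in the spirit of plyr::rbind.fill(). It is not the code used in read.socrata().

```r
# Two hypothetical pages: page1 has no `hardship` column because every
# value in that column was missing on the first page.
page1 <- data.frame(area = c("A", "B"), stringsAsFactors = FALSE)
page2 <- data.frame(area = c("C", "D"), hardship = c(10, 20),
                    stringsAsFactors = FALSE)

# Plain rbind() fails because the column counts differ.
rbind_error <- tryCatch(rbind(page1, page2),
                        error = function(e) conditionMessage(e))
print(rbind_error)

# Hypothetical helper: add any missing columns as NA, align the column
# order, then bind the padded data frames.
bind_fill <- function(...) {
  dfs <- list(...)
  all_cols <- unique(unlist(lapply(dfs, names)))
  padded <- lapply(dfs, function(df) {
    for (col in setdiff(all_cols, names(df))) {
      df[[col]] <- rep(NA, nrow(df))
    }
    df[, all_cols, drop = FALSE]
  })
  do.call(rbind, padded)
}

combined <- bind_fill(page1, page2)
print(combined)
```

A test along these lines only exercises the page-combining step, which is why it needs a dataset whose first page lacks a column entirely.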

@geneorama let me know what you think.

@geneorama
Member

geneorama commented Oct 18, 2016

So I don't think we can test for this until we come across a dataset with a column of all NAs in the first 1000 rows.

Here's an example where you can find all NAs in the first 1000, 25000, or 50000 rows:
https://data.smgov.net/resource/ia9m-wspt.json?&$where=incident_date<'2012-01-01'

@nicklucius
Contributor

nicklucius commented Oct 19, 2016

Thanks @geneorama! I made a test using that URL, which fails with rbind() and passes with rbind.fill().

@Jagoul

Jagoul commented Dec 23, 2016

Can you please provide the final steps for installing the related packages so that we can get past this error?

@nicklucius
Contributor

Hi @Jagoul. We have uploaded the fix to CRAN, so the current version of RSocrata on CRAN should no longer have this bug. Try installing RSocrata again and see if that works. If you are still having an issue, let us know what you're trying to run and we can take a look at it.

@Jagoul

Jagoul commented Dec 23, 2016

install.packages("/path/to/RSocrata-1.5.1.tar.gz", repos=NULL, type='source') is the line of code that I am trying to run. Could you please help me solve this error? I am trying to download a CSV file from a website, and it gives me the following error:
Error in rbind(deparse.level, ...) :
numbers of columns of arguments do not match

Installing the package again gave me the following error:
Installing package into ‘/home/jagoul/R/i686-pc-linux-gnu-library/3.2’
(as ‘lib’ is unspecified)
Warning: invalid package ‘/path/to/RSocrata-1.5.1.tar.gz’
Error: ERROR: no packages specified
Warning in install.packages :
installation of package ‘/path/to/RSocrata-1.5.1.tar.gz’ had non-zero exit status

@nicklucius
Contributor

@Jagoul- RSocrata is now in version 1.7.1, and it looks like you might be running 1.5.1.

You can install the new version with install.packages('RSocrata')
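
To check whether an installed copy is older than a given release, a version comparison along these lines may help. The helper name is_older_than is hypothetical, and the guarded block only runs when RSocrata is actually installed.

```r
# Hypothetical helper: TRUE when `installed` is an older version than `minimum`.
is_older_than <- function(installed, minimum) {
  utils::compareVersion(installed, minimum) < 0
}

print(is_older_than("1.5.1", "1.7.1"))
print(is_older_than("1.7.1", "1.7.1"))

# Compare the locally installed RSocrata, if any, against 1.7.1.
if (requireNamespace("RSocrata", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("RSocrata"))
  print(is_older_than(installed, "1.7.1"))
}
```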

@Jagoul

Jagoul commented Dec 23, 2016

Hey nicklucius, the package is installed now, but I still get the following error:

Retrieving administrative levels...
Converting values to a tabular format...
Error in rbind(deparse.level, ...) :
numbers of columns of arguments do not match

@nicklucius
Contributor

@Jagoul- Could you copy and paste the read.socrata() line that you are trying to use? And if you could type sessionInfo() and copy the output, that could help as well.

@Jagoul

Jagoul commented Dec 23, 2016

I am not using the library in my code right now, but I want to get rid of the error mentioned above, so I thought using RSocrata would help with binding the datasets.

sessionInfo()
R version 3.2.5 (2016-04-14)
Platform: i686-pc-linux-gnu (32-bit)
Running under: Ubuntu 14.04.4 LTS

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] httr_1.1.0 activityinfo_0.4.20

loaded via a namespace (and not attached):
[1] rjson_0.2.15 R6_2.1.2 rsconnect_0.4.3 tools_3.2.5 curl_0.9.7

@nicklucius
Contributor

@Jagoul- Unfortunately, if you're getting that error and you're not using RSocrata, then you are having a problem unrelated to RSocrata.

@Momut1

Momut1 commented Aug 13, 2018

I am getting the same error message when using reshape for a simple dataframe melt. Can anyone shed some light on this? Thank you!

@geneorama
Member

@Momut1 this exact error message is a common error message encountered with reshape and melt. Unless your error is specifically related to RSocrata operation, I would recommend searching for the error on StackOverflow and finding a solution that way.

If you have a new question, StackOverflow would be a more appropriate venue for getting support / help on that.

@ThomasPepperz

The real problem may be the fact that you have blank columns that the file still thinks are there. I had 20 data files, mostly with the same column names. Some of them were rbind-ing and others weren't. The extra columns at the end were blank except for their names. I simply highlighted them and pressed backspace, thinking I had deleted the columns and thereby made all 20 files the same. However, I kept receiving the same error you all have received. I opened up all of the files, highlighted all of the rows in Excel, and right-clicked and selected delete rows. I closed all of the files and attempted to rbind all twenty, and it started working.
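
The same cleanup can be sketched in R instead of Excel: drop columns that are entirely blank or NA before binding. The helper drop_blank_columns and the two small data frames are hypothetical, used only to illustrate the idea.

```r
# Hypothetical helper: remove columns whose values are all NA or empty strings.
drop_blank_columns <- function(df) {
  is_blank <- vapply(
    df,
    function(col) all(is.na(col) | trimws(as.character(col)) == ""),
    logical(1)
  )
  df[, !is_blank, drop = FALSE]
}

# Made-up stand-ins for two imported files; file1 carries a stray blank column.
file1 <- data.frame(id = 1:2, value = c("a", "b"), extra = c(NA, NA),
                    stringsAsFactors = FALSE)
file2 <- data.frame(id = 3:4, value = c("c", "d"), stringsAsFactors = FALSE)

# Binding the raw files fails because the column counts differ.
raw_error <- tryCatch(rbind(file1, file2),
                      error = function(e) conditionMessage(e))
print(raw_error)

# After dropping the blank column, both files have the same columns.
cleaned <- rbind(drop_blank_columns(file1), drop_blank_columns(file2))
print(cleaned)
```

Note that this also drops a legitimately empty column, so it is only appropriate when the blank columns are known to be leftovers.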
