id should be factor, not a character #3

Open
qgeissmann opened this issue Aug 22, 2017 · 19 comments

Comments

@qgeissmann
Contributor

same issue as rethomics/scopr#2

@pepelisu
Contributor

pepelisu commented Jan 25, 2018

The tutorial does not specify that this should be a character. If it is loaded from the csv as an integer, load_dam2 complains:

Error in `[.data.table`(q, , .(regions = list(region_id)), by = c("path",  : 
  column or expression 1 of 'by' or 'keyby' is type list. Do not quote column names. Usage: DT[,sum(colC),by=list(colA,month(colB))]

@estebanbeck

Hi, a collaborator is having the same problem, but I don't see how to get around it. How should the monitor file be saved? This is probably because the data was passed through Excel to process (crop) it, right?

@qgeissmann
Contributor Author

hi, can she/he send the code they are using/a sample file? cheers

@estebanbeck

With the sample files, it works perfectly. It doesn't work with their files, but they look really similar!

@NicoleStephens96

Could somebody please explain why this error occurs and how to correct it? The first few times I used Rethomics I did not get this error, but now I do, even with the files that originally worked.

@qgeissmann
Contributor Author

qgeissmann commented Nov 26, 2018

hi @NicoleStephens96, I probably have time to look into it this week. Can you send as much detail as possible regarding your code and the errors you get (i.e. which line of code raises the error)? Also, if you can send me your metadata file, it would help me a lot :), thanks :)

@NicoleStephens96

NicoleStephens96 commented Nov 26, 2018 via email

@qgeissmann
Contributor Author

hey, after which line of code do you get the error? Also, which practice file/tutorial are you using :)?
thanks

@NicoleStephens96

NicoleStephens96 commented Nov 26, 2018 via email

@pepelisu
Contributor

I think the error comes from the Excel to csv conversion. When the metadata is created in Excel and the column region_id is interpreted as an integer (number), it is saved as a number in the csv. However, the rethomics function load_dam expects a character in that column. To solve the problem:

  1. open the csv in Excel
  2. change the column format from number or general to text.
  3. save the file again in csv format, selecting the option of text separated by commas.

To solve this issue permanently, I would recommend checking the type of the column and, if it is an integer, converting it to character (if needed).
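
The type check suggested above can also be done in R rather than in Excel. This is only a minimal sketch on a made-up two-row csv (the file contents are assumptions, only the column name region_id comes from this thread). It shows how to see which type fread guessed for each column and how to force a type at read time. Which type rethomics actually expects for region_id is not settled at this point in the thread, so the sketch does not take a side.

```r
library(data.table)

# Toy metadata written to a temporary csv. The values are invented for
# illustration; only the column name region_id is taken from the thread.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("file,region_id", "Monitor6.txt,30", "Monitor6.txt,31"), csv_path)

# Default read: fread guesses the type of each column.
guessed <- fread(csv_path)
guessed_types <- sapply(guessed, class)
print(guessed_types)

# Same file, but with region_id forced to character at read time.
forced <- fread(csv_path, colClasses = list(character = "region_id"))
forced_types <- sapply(forced, class)
print(forced_types)
```

Comparing the two printed vectors shows whether the type of a column depends on how the file was read rather than on how it was saved.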

@qgeissmann
Contributor Author

qgeissmann commented Nov 30, 2018

Thanks @pepelisu!
To be honest I am struggling a bit with this thread. I cannot reproduce the bug (yet). I don't think the region_id type is the issue. region_id is expected to be an integer all along (e.g. it is an integer in all the tests). In fact, linking will not work if region_id is a character. Therefore I would be surprised if this worked... To me it looks more like a data.table issue... @NicoleStephens96 does @pepelisu's trick solve anything for you?

@qgeissmann
Contributor Author

qgeissmann commented Nov 30, 2018

My understanding is that the path to the file that is generated during the linking is a list in your platform and a character in mine.
@NicoleStephens96, it would help a lot if you could run this for me:

# the normal linking of the tutorial metadata
metadata <- link_dam_metadata(metadata, result_dir = DATA_DIR)
metadata[, sapply(file_info, function(x) x$path)]

and

str(metadata)

and paste the results for both :)
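
For anyone who wants to check for list columns themselves, here is a generic sketch on a toy data.table. It is not real linked metadata, and the column names path_chr, path_list and path_flat are hypothetical. It only shows how to spot columns of type list, which is the type the error message above complains about in 'by'.

```r
library(data.table)

# Toy table, not real rethomics output: path_chr is an ordinary character
# column, path_list holds the same values wrapped in a list.
toy <- data.table(
  id = 1:2,
  path_chr = c("a/Monitor1.txt", "a/Monitor2.txt"),
  path_list = list("a/Monitor1.txt", "a/Monitor2.txt")
)

# TRUE marks the columns that are of type list.
list_cols <- sapply(toy, is.list)
print(list_cols)

# A list column can be flattened when every element has length one.
if (all(lengths(toy$path_list) == 1L)) {
  toy[, path_flat := unlist(path_list)]
}
print(sapply(toy, class))
```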

@NicoleStephens96

I figured out the problem last night on my end and maybe this can help anyone else getting the same error:)

I was looking for differences between the data I got working and data I could not get working. What I found was, for some reason, extra rows were added to the metadata that contained blanks and N/A.


94: Monitor6.txt 2018-11-10 00:00:00 2018-11-16 00:00:00 6 30 M Sal 51
95: Monitor6.txt 2018-11-10 00:00:00 2018-11-16 00:00:00 6 31 M Sal 53
96: Monitor6.txt 2018-11-10 00:00:00 2018-11-16 00:00:00 6 32 M Sal 58
97: NA NA
98: NA NA
99: NA NA
file start_datetime stop_datetime machine_id region_id Sex Genotype

I simply deleted the extra rows using the code below and was able to link and load the data without a problem.

metadata<- metadata[-c(97,98,99)]
metadata

I appreciate everybody's help!

@qgeissmann
Contributor Author

thanks, that really helps! (also you can use metadata <- na.omit(metadata)). So were they completely empty rows, or rows with just some missing values?
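
The difference between the two cases matters for the fix. A small sketch on made-up metadata (the values and the lowercase column name sex are assumptions): na.omit drops every row with at least one missing value, while the second filter drops only rows in which every column is missing. Cells that Excel saved as empty strings rather than NA would need an extra check that is not shown here.

```r
library(data.table)

# Toy metadata with invented values: row 2 has one missing value,
# row 3 is completely empty.
metadata <- data.table(
  file = c("Monitor6.txt", "Monitor6.txt", NA),
  region_id = c(30L, 31L, NA),
  sex = c("M", NA, NA)
)

# na.omit drops every row that has at least one NA.
any_na_removed <- na.omit(metadata)
print(any_na_removed)

# Keep rows that have at least one non-missing value.
n_missing <- rowSums(is.na(metadata))
keep <- n_missing < ncol(metadata)
empty_removed <- metadata[keep]
print(empty_removed)
```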

@NicoleStephens96

NicoleStephens96 commented Nov 30, 2018 via email

@qgeissmann
Contributor Author

so Excel seems to keep ghost rows that are completely empty. I will add a check to remove all empty rows from metadata

@qgeissmann
Contributor Author

@NicoleStephens96 can you upload an Excel-generated csv with empty rows at the end for me, so I can see exactly what they look like? Thanks

qgeissmann pushed a commit that referenced this issue Feb 1, 2019
@jtengjia

jtengjia commented Jun 15, 2022

In my case, the problem was that the metadata contained two identical lines. Under normal conditions, using the tutorial data, the result should look like this:
[image]
but when my metadata contains repeated lines it looks like this:
[image]
so my solution is:
metadata_final <- fread(paste0(DATA_DIR_final,"/metadata.csv")) %>% unique()
Using the 'unique()' function removes the duplicated lines, and then the error disappears. I hope this helps someone getting the same error.
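
Before dropping duplicates, it can help to see which rows are repeated. A minimal sketch on a toy table with invented values, using the duplicated() and unique() methods that data.table provides (both compare all columns by default):

```r
library(data.table)

# Toy metadata with invented values: the third row repeats the second.
metadata <- data.table(
  file = c("Monitor6.txt", "Monitor6.txt", "Monitor6.txt"),
  region_id = c(30L, 31L, 31L)
)

# Rows that repeat an earlier row.
dup_rows <- metadata[duplicated(metadata)]
print(dup_rows)

# The same table with repeated rows removed.
deduplicated <- unique(metadata)
print(deduplicated)
```

If dup_rows is not empty, the original csv has lines that were entered twice and may be worth fixing at the source.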

@ronjafrigard

The %>% unique () command worked for me! Thank you!
