Trouble using aggregate function in R #639

yaaminiv · 2017-06-13T07:46:46Z

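I have a spreadsheet (http://owl.fish.washington.edu/spartina/DNR_Skyline_20170524/2017-06-10-protein-areas-only-error-checked.csv) with protein names and associated peak areas, but each protein transition has its own row.

I want to average the peak areas across transitions, so that I have one row in my spreadsheet per protein. I'm using the aggregate function in R to average peak areas for each protein. I'm using na.rm and na.action to handle all of the N/A values in my spreadsheet, based on this Stack Overflow thread: https://stackoverflow.com/questions/17737174/blend-of-na-omit-and-na-pass-using-aggregate-in-r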

averageProteinAreas <- aggregate(proteinAreas[-1], proteinAreas[1], mean, na.action = na.omit, na.rm = TRUE)

However, I just get a spreadsheet filled with N/As and the warning message "1: In mean.default(X[[i]], ...) :
argument is not numeric or logical: returning NA"

Not really sure how to fix this/what the error message is actually saying. Any help would be much appreciated!


seanb80 · 2017-06-13T12:48:11Z

Just want to make sure I understand what you want: the average peak area by protein, right? So the mean of each row? If so, you can use the rowMeans() function, passing it the data frame without the protein label column. Off the cuff it would look something like proteinAverages <- rowMeans(proteinAreas[, 2:ncol(proteinAreas)], na.rm = TRUE). I'll look into why aggregate wasn't working once I get my laptop out.
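For anyone comparing the two approaches, here is a minimal sketch on made-up data (the column names protein, sample1 and sample2 and all values are hypothetical, not taken from the real spreadsheet). It only illustrates that rowMeans() returns one mean per row, i.e. per transition, while aggregate() returns one row per protein:

```r
# Toy data; column names and values are made up for illustration
proteinAreas <- data.frame(
  protein = c("A", "A", "B", "B"),
  sample1 = c(1, 3, 10, NA),
  sample2 = c(2, 4, NA, 30)
)

# One mean per row (per transition), ignoring NAs
transitionMeans <- rowMeans(proteinAreas[, 2:ncol(proteinAreas)], na.rm = TRUE)

# One row per protein: each sample column is averaged over that protein's rows
proteinMeans <- aggregate(proteinAreas[-1], proteinAreas[1], mean, na.rm = TRUE)

print(transitionMeans)
print(proteinMeans)
```

If the goal is one row per protein, the aggregate() form is probably the one that matches it.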


seanb80 · 2017-06-13T15:56:58Z

Ah, I looked into it a little further and realized it's because your NAs are being read as strings. R expects missing values to look like "NA", not "#N/A" as you have them in your sheet, so any column containing "#N/A" comes in as character and mean() returns NA with that warning. You can read them in with something like proteinAreas <- read_csv("~/Documents/RobertsLab/work/2017-06-10-protein-areas-only-error-checked.csv", na = "#N/A") (read_csv is from the readr package) and it looks like your aggregate command then works fine.
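A small self-contained sketch of the same problem and fix, under a few assumptions: the CSV text below is made up, the column names are hypothetical, and it uses base R's read.csv() with na.strings rather than readr's read_csv() with na, so that it runs without extra packages:

```r
# Made-up CSV contents standing in for the real spreadsheet
csvText <- "protein,sample1,sample2
A,1,2
A,3,#N/A
B,#N/A,30
B,10,50"

# Default read: "#N/A" is not treated as missing, so the columns are character
asRead <- read.csv(text = csvText, stringsAsFactors = FALSE)
print(sapply(asRead[-1], class))

# Declaring "#N/A" as the missing-value marker gives numeric columns
proteinAreas <- read.csv(text = csvText, na.strings = "#N/A")
print(sapply(proteinAreas[-1], class))

# With numeric columns, the original aggregate call averages per protein
averageProteinAreas <- aggregate(proteinAreas[-1], proteinAreas[1], mean, na.rm = TRUE)
print(averageProteinAreas)
```

Checking the column classes right after reading the file is a quick way to spot this kind of issue before calling aggregate().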


yaaminiv · 2017-06-14T01:15:53Z

ahh whoops. thanks @seanb80!

yaaminiv closed this as completed Jun 14, 2017
