Trouble using aggregate function in R #639
Just want to make sure I understand what you're after: you want the average peak area by protein, right? So the mean of each row? If so, you can use the rowMeans() function, passing it the data frame without the protein label column (that is, every column except the first).
Off the cuff it would look something like proteinAverages <- rowMeans(proteinAreas[, 2:ncol(proteinAreas)], na.rm = TRUE)
I'll look into why aggregate wasn't working once I get my laptop out.
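To make the difference between the two approaches concrete, here is a minimal sketch on made-up data. The column names `Protein`, `Sample1`, and `Sample2` and all values are hypothetical, not taken from the actual spreadsheet. `rowMeans()` averages across the columns within each row, while `aggregate()` averages down the rows that share a protein name, which is what collapses several transition rows into one row per protein.

```r
# Hypothetical toy data: two transitions for protein A, one for protein B.
proteinAreas <- data.frame(
  Protein = c("A", "A", "B"),
  Sample1 = c(10, 20, 5),
  Sample2 = c(30, NA, 15),
  stringsAsFactors = FALSE
)

# Mean across the numeric columns of each row: one value per transition row.
rowAverages <- rowMeans(proteinAreas[, 2:ncol(proteinAreas)], na.rm = TRUE)

# Mean of each numeric column within each protein: one row per protein.
proteinAverages <- aggregate(proteinAreas[-1], proteinAreas[1], mean, na.rm = TRUE)

print(rowAverages)
print(proteinAverages)
```

If the goal is one row per protein, the `aggregate()` form is probably the one that matches, since `rowMeans()` keeps one value per transition row.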
On Jun 13, 2017, at 12:46 AM, Yaamini Venkataraman wrote:
I have a spreadsheet <http://owl.fish.washington.edu/spartina/DNR_Skyline_20170524/2017-06-10-protein-areas-only-error-checked.csv> with protein names and associated peak areas, but each protein transition has its own row.
I want to average the peak areas across transitions, so that I have one row in my spreadsheet per protein. I'm using the aggregate function in R to average peak areas for each protein. I'm using na.rm and na.action to handle all of the N/A values in my spreadsheet, based on this Stack Overflow thread <https://stackoverflow.com/questions/17737174/blend-of-na-omit-and-na-pass-using-aggregate-in-r>.
averageProteinAreas <- aggregate(proteinAreas[-1], proteinAreas[1], mean, na.action = na.omit, na.rm = TRUE)
However, I just get a spreadsheet filled with N/As and the warning message "1: In mean.default(X[[i]], ...) :
argument is not numeric or logical: returning NA"
Not really sure how to fix this/what the error message is actually saying. Any help would be much appreciated!
Ah, I looked into it a little further and just realized it's because your NAs are being interpreted as strings. R expects missing values to look like "NA", not "#N/A" as you have them in your sheet, so any column containing "#N/A" is read in as text rather than numbers.
You can read them in with something like
library(readr)  # provides read_csv()
proteinAreas <- read_csv(
  "~/Documents/RobertsLab/work/2017-06-10-protein-areas-only-error-checked.csv",
  na = "#N/A"
)
and then it looks like your aggregate command works fine.
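The cause and the fix can be reproduced on a few lines of made-up CSV text. This is only a sketch: the data and the names `csvText`, `asRead`, and `fixed` are hypothetical, and it swaps in base R's `read.csv()` with its `na.strings` argument in place of readr's `read_csv()` with `na`, so that it runs without extra packages.

```r
# Hypothetical CSV text standing in for the real spreadsheet.
csvText <- "Protein,Sample1,Sample2
A,10,30
A,20,#N/A
B,5,15"

# Default read: "#N/A" is not recognised as missing, so Sample2 becomes text.
asRead <- read.csv(text = csvText, stringsAsFactors = FALSE)
print(class(asRead$Sample2))

# mean() on a text column warns "argument is not numeric or logical" and gives NA.
badMean <- suppressWarnings(mean(asRead$Sample2, na.rm = TRUE))

# Declaring "#N/A" as the missing-value marker keeps Sample2 numeric.
fixed <- read.csv(text = csvText, na.strings = "#N/A", stringsAsFactors = FALSE)

# The aggregate call from the question now averages within each protein.
averageProteinAreas <- aggregate(fixed[-1], fixed[1], mean, na.rm = TRUE)
print(averageProteinAreas)
```

If a file mixes both markers, passing `na.strings = c("NA", "#N/A")` should cover both.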
ahh whoops. thanks @seanb80!