uploading scripts for review #5

Open
wants to merge 1 commit into
base: master
stao1

@stao1 stao1 commented Mar 17, 2018

No description provided.

bit$X....Preliminary <- NULL

# replace "." with "NA"
bit$Average.Permit.Price[bit$Average.Permit.Price=="."] <- NA
Collaborator

If you want to do this for the whole data frame, here's an alternative: df[df=="."] <- NA (see https://stackoverflow.com/questions/19503266/replace-all-particular-values-in-a-data-frame)

Collaborator

Also, to make code more readable, it's generally good to put spaces around = or == (which I didn't do, oops!). See http://style.tidyverse.org/
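A minimal sketch of the whole-data-frame replacement, written with spaces around the operators. The data frame, its column names, and its values are made up for illustration and are not taken from the permit data:

```r
# Hypothetical toy data: "." marks a missing value in both columns
df <- data.frame(price = c("10", ".", "12"),
                 count = c(".", "3", "4"),
                 stringsAsFactors = FALSE)

# Replace every "." in the data frame with NA in one step
df[df == "."] <- NA

# The columns are still character vectors, so convert them if numbers are needed
df$price <- as.numeric(df$price)
df$count <- as.numeric(df$count)
```

Because the columns were read as text, the replacement alone does not change their type, so the explicit as.numeric() conversion is still needed afterwards.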


# remove commas
bit$Resident.Interim.Permits.Issued <- gsub(",", "", bit$Resident.Interim.Permits.Issued)
bit$Resident.Interim.Permits.Issued <- gsub(",", "", bit$Resident.Interim.Permits.Issued)
Collaborator

here's one example alternative method:

df <- data.frame(x = c("a", "1,x", "b", "c"),
        y = c("b,b", "a,", "b", "f"),
        z = c("a", "a", "g,,fhj", "a"),
        stringsAsFactors = FALSE)

sapply(df, function(col){gsub(",", "", col)})

Note that sapply returns a character matrix here rather than a data frame, so it's worth checking that nothing downstream depends on the data frame structure or column types. There are also tidyverse ways of doing things like this, but I'll leave that for you to explore :)
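If keeping the data frame matters, one possible variation (a sketch, not necessarily what the script should use) is to assign the result of lapply back into df[], which preserves the data frame structure. The toy data below is the same as in the example above; the name as_matrix is just an illustrative variable:

```r
df <- data.frame(x = c("a", "1,x", "b", "c"),
                 y = c("b,b", "a,", "b", "f"),
                 z = c("a", "a", "g,,fhj", "a"),
                 stringsAsFactors = FALSE)

# sapply simplifies the result to a character matrix
as_matrix <- sapply(df, function(col) gsub(",", "", col))

# Assigning into df[] keeps df a data frame with the same column names
df[] <- lapply(df, function(col) gsub(",", "", col))
```

Either way the cleaned columns are still character vectors, so numeric columns would need an as.numeric() conversion afterwards.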

Member

this is really cool


# write and validate EML
write_eml(eml, eml_path)
eml_validate(eml_path)
Collaborator

It's generally better to run eml_validate before write_eml, just so you don't overwrite your file with invalid EML! So that'd be:

eml_validate(eml)
write_eml(eml, eml_path)

a <- read.csv('/home/stao/my-sasap/114_commercial_crew/Commercial Crew data 2012-2016.csv',
              header = TRUE,
              stringsAsFactors = FALSE,
              na.strings = c("", "Not Available"))
Collaborator

hmm it looks like na.strings solved the problem you had to deal with in the other file ("." --> NA)! Cool, I didn't know about this argument!
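For reference, a small sketch of how na.strings could handle the "." case at read time. The CSV text, column names, and values are invented for illustration; only read.csv and its na.strings argument come from the script:

```r
# Hypothetical inline CSV: ".", "Not Available", and an empty field all mean "missing"
csv_text <- "name,price\nA,10\nB,.\nC,Not Available\nD,\n"

dat <- read.csv(text = csv_text,
                header = TRUE,
                stringsAsFactors = FALSE,
                na.strings = c("", ".", "Not Available"))

# price is read as a numeric column, with NA wherever a missing-value marker appeared
summary(dat$price)
```

Handling the markers in read.csv means the column is parsed as numeric from the start, so no later replacement or type conversion is needed.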


# correct typos
typo <- which(a$Full.Name == ",ARL A. HIXSON")
a$Full.Name[typo] <- "CARL A. HIXSON"
Collaborator

If you want, I think it's also possible to do all this in one line (without which()):
a$Full.Name[a$Full.Name == ",ARL A. HIXSON"] <- "CARL A. HIXSON"

It does get a little bit harder to read though, so you could go either way.
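A quick sketch comparing the two forms on a made-up vector (the second name and the NA entry are illustrative, not from the crew data). With a single replacement value, both give the same result even when the vector contains NA:

```r
# Hypothetical stand-in for a$Full.Name
full_name <- c(",ARL A. HIXSON", "JANE DOE", NA)

# Two-step version using which()
with_which <- full_name
typo <- which(with_which == ",ARL A. HIXSON")
with_which[typo] <- "CARL A. HIXSON"

# One-line version using logical indexing
with_logical <- full_name
with_logical[with_logical == ",ARL A. HIXSON"] <- "CARL A. HIXSON"

identical(with_which, with_logical)
```

The which() version has the small advantage that typo can be inspected first (for example, to confirm exactly one row matched) before anything is overwritten.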

eml@dataset@intellectualRights <- new('intellectualRights',
.Data = "CFEC retains intellectual property rights to data collected by or for CFEC. Any dissemination of the data must credit CFEC as the source, with a disclaimer that exonerates the department for errors or deficiencies in reproduction, subsequent analysis, or interpretation. Please see http://www.adfg.alaska.gov/index.cfm?adfg=home.copyright for further information.")

# change abstract
Collaborator

Tip: you can create foldable "sections" in RStudio by ending a comment line with at least four dashes (----), equals signs (====), or pound signs (####). See here for more info: https://support.rstudio.com/hc/en-us/articles/200484568-Code-Folding-and-Sections
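As a rough illustration, a script laid out with section comments might look like this. The section names and the toy code under them are placeholders, not taken from the script under review:

```r
# Load data ----
x <- c(1, 2, 3)

# Clean data ====
x <- x[x > 1]

# Summarise ####
total <- sum(x)
```

Each of these comment lines should show up in RStudio's section navigator and can be folded independently.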

@dmullen17
Member

@stao1 if you want to make some of these changes you can update your pull request with these instructions: https://github.com/NCEAS/data-processing#making-changes-to-your-contribution

3 participants