# Dictionary: Stata to R

The table below gives quick translations of common Stata commands into R. Because R can hold several data sets in memory at once, commands that access or modify data must name the data set they operate on. The examples use `mydata` as the default data frame.
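
As a minimal sketch of what naming the data set looks like in practice, the snippet below keeps two data frames in memory at once. The second data frame `mydata2` and all of the values are hypothetical and only serve as illustration.

```r
# Two hypothetical data frames living in the same R session
mydata  <- data.frame(x = c(1, 2, 3), y = c(2, 4, 7))
mydata2 <- data.frame(x = c(10, 20), z = c("a", "b"))

# Columns are reached through the data frame that owns them
mean_x  <- mean(mydata$x)    # x from mydata
mean_x2 <- mean(mydata2$x)   # x from mydata2, a different variable

# with() evaluates an expression inside a single data frame
total <- with(mydata, sum(x + y))

# Modelling functions receive the data frame through the data= argument
fit <- lm(y ~ x, data = mydata)
```

Referring to a bare `x` would fail (or silently pick up an unrelated object) unless a variable of that name exists in the workspace, which is why most R entries in this dictionary spell out `mydata$x` or pass `data=mydata`.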

| Stata | R | Description |
| --- | --- | --- |
| `cls` | `cat("\014")` or `cat(rep("\n",50))` | Clear the Stata output / R console |
| `clear all` | `rm(list=ls())` | Clear data, value labels, etc. from memory |
| `cd "mydirectory"` | `setwd("mydirectory")` | Change the working directory |
| `pwd` | `getwd()` | Display the working directory |
| `reg y x1 x2` | `summary(lm(y~x1+x2, data=mydata))` | Ordinary least squares with constant |
| `reg y x1 x2, nocon` | `summary(lm(y~x1+x2-1, data=mydata))` | Ordinary least squares without constant |
| `if (x==y) {...}` | `if (x==y) {...}` | Opening condition that determines whether the enclosed command(s) are executed |
| `reg y x if (x>0)` | `lm(y~x, data=subset(mydata,x>0))` | Run a command on a conditional subset of the data |
| `forvalues i=1/100 {...}` | `for (i in 1:100) {...}` | Loop through integer values of i from 1 to 100 |
| `foreach i in "a" "b" "c" {...}` | `for (i in c("a","b","c")) {...}` | Loop through a list of items |
| `di "Hello World"` | `print("Hello World")` | Print "Hello World" on screen |
| `do "mydofile.do"` | `source("myRscript.R")` | Call and run a code file |
| `save "mydata.dta", replace` | `save.image("mydata.Rdata")` | Save the current data (Stata) / the entire workspace (R) |
| `di 2345^2` | `2345^2` | Calculate 2345 squared |
| `logit y x` | `summary(glm(y~x,data=mydata,family="binomial"))` | Logit maximum likelihood estimation |
| `probit y x` | `summary(glm(y~x,data=mydata,family=binomial(link = "probit")))` | Probit maximum likelihood estimation |
| `sort x y` | `mydata <- mydata[order(mydata$x, mydata$y),]` | Sort the data frame by x, then by y |
| `cor x y` | `cor(mydata$x, mydata$y)` | Correlation between x and y |
| `help command` | `?command` or `help(command)` | Load the help file on a command |
| `edit` | `edit(mydata)` | Open the data editor window (not recommended) |
| `summarize` | `summary(mydata)` | Provide summary values for the data |
| `table x y` | `table(mydata$x,mydata$y)` or `ftable(y~x,data=mydata)` | Two-way table |
| `hist x` | `hist(mydata$x)` | Histogram of variable x |
| `scatter x y` | `plot(x~y, data=mydata)` | Scatter plot of x (vertical axis) against y (horizontal axis) |
| `list` | `mydata` (all rows)<br>`mydata[1:5,]` (first 5 rows) | Print the values of the data frame to the screen, either all rows or only the first 5 |
| `generate x2=x^2` | `mydata$x2 <- mydata$x^2` | Create a new variable x2 which is the square of x |
| `replace x=y1+y2` | `mydata$x <- mydata$y1 + mydata$y2` or<br>`mydata$x <- with(mydata, y1 + y2)` | Set x equal to y1+y2 |
| `forvalues i=1/10 {`<br>``di `i'``<br>`}` | `for (i in 1:10) print(i)` | Print the integers from 1 to 10 |
| `replace x=0 if x<0` | `mydata$x[mydata$x<0] <- 0` | Replace all values of x less than 0 with zero |
| `drop if x>100` | `mydata <- subset(mydata,!(x>100))` | Drop observations with x greater than 100 |
| `keep if x<100` | `mydata <- subset(mydata,x<100)` | Keep observations with x less than 100 |
| `drop x` | `mydata$x <- NULL` | Drop variable x from the data |
| `keep x` | `mydata <- mydata["x"]` | Keep only x in the data (the result is still a data frame) |
| `append using "mydata2.dta"` | `mydata <- rbind(mydata, mydata2)` | Append mydata2 to mydata |
| `merge 1:1 index using "mydata2.dta"` | `mydata <- merge(mydata, mydata2, by="index")` | Merge two data sets by index variable(s) |
| `set obs 1000`<br>`gen x=rnormal()` | `mydata$x <- rnorm(1000)` | Generate 1000 random normal draws |
| `set obs 1000`<br>`gen x=runiform()` | `mydata$x <- runif(1000)` | Generate 1000 random uniform draws |
| `set obs 1000`<br>`gen x=rbinomial(10,.1)` | `mydata$x <- rbinom(1000, 10, .1)` | Generate 1000 random binomial (10,.1) draws |
| `count` | `nrow(mydata)` | Count the number of observations in the data |
| `foreach v of varlist * {`<br>``rename `v' `v'old``<br>`}` | `names(mydata) <- paste0(names(mydata),"old")` | Append "old" to the name of every variable in the data |
| `rename oldvar newvar` | `colnames(mydata)[colnames(mydata)=="oldvar"] <- "newvar"` | Rename a variable |
| `clear`<br>`set obs 100`<br>`gen x=rnormal()`<br>`gen y=x*2 + rnormal()*5` | `x <- rnorm(100)`<br>`mydata <- data.frame(x=x, y=x*2 + rnorm(100)*5)` | Simulate a new data set with y dependent upon x |
| `egen id = group(x y)` | `mydata$id <- as.integer(interaction(mydata$x, mydata$y, drop=TRUE))` or<br>`mydata$id <- with(mydata, as.integer(interaction(x, y, drop=TRUE)))` | Create a group identifier id for each distinct combination of x and y (the numbering order may differ from Stata's) |
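
Several of the entries above can be chained into a short session. The following is a minimal sketch, not a definitive workflow: the seed, the cut-off of 1, and the choice to zero out negative values of `y` are arbitrary assumptions made for illustration, and the Stata equivalent of each step is given in the comments.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# clear / set obs 100 / gen x = rnormal() / gen y = x*2 + rnormal()*5
x <- rnorm(100)
mydata <- data.frame(x = x, y = x * 2 + rnorm(100) * 5)

# generate x2 = x^2
mydata$x2 <- mydata$x^2

# replace y = 0 if y < 0
mydata$y[mydata$y < 0] <- 0

# keep if x < 1
mydata <- subset(mydata, x < 1)

# sort x y
mydata <- mydata[order(mydata$x, mydata$y), ]

# count
n_obs <- nrow(mydata)

# reg y x x2
fit <- lm(y ~ x + x2, data = mydata)
print(summary(fit))
```

Note that each data-modifying step assigns its result back to `mydata`. Unlike Stata, R functions such as `subset()` and `order()` return a new object and leave the original untouched unless the result is stored.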

Thanks to Sebastian Kranz, I have been made aware of RStata.pdf, a document by Oscar Torres-Reyna that provides a similar translation.

It is also worth considering purchasing Bob Muenchen's 542-page book "R for Stata Users".
