# by

In today's session we'll use [transplants.dta](transplants.dta) to demonstrate a few useful commands that we will use quite frequently.

```Stata

use transplants, clear

```

What if we want a variable that holds the total number of records in each ABO blood type?

```Stata

bys abo: gen cat_n = _N
```

So what did this code achieve?

```Stata

tab cat_n
```
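
To see what `_N` does under `by`, here is a minimal sketch on a toy dataset. The six values of `abo` below are made up for illustration and are not taken from transplants.dta; `seq` is a hypothetical variable name added only to contrast `_n` with `_N`.

```Stata
* Toy data: made-up blood-type codes, not from transplants.dta
clear
input byte abo
1
1
2
4
4
4
end

* Under by, _N is the number of rows in the current group,
* and _n is the position of the row within that group
bysort abo: gen cat_n = _N
bysort abo: gen seq   = _n

list, sepby(abo)
```

Every row of a group carries the same `cat_n`, which is why `tab cat_n` on the real data shows one value per blood type.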

Let's add labels to the groups:

```Stata
#delimit ;
lab define abo_lab
   1 "A"
   2 "B"
   3 "AB"
   4 "O"
;
#delimit cr

lab values abo abo_lab
```

Now let's tabulate once again:

```Stata

tab abo
```
Nice!

Ok. Do you remember these commands?

```Stata
di c(N)

di c(k)
```

Let's create a `disturbance` to our setup to learn a few additional commands and their value to us:

```Stata

qui do https://raw.githubusercontent.com/jhustata/book/main/sample-keep.ado
```

Is that a new command? Or are you already familiar with it? How about this:

```Stata
samplekeep
```

What just happened?

```Stata
di c(N)

di c(k)
```

Let's rerun earlier commands to restore our sanity!

```Stata
tab abo
```

Ok, then. It was a mere temporary disturbance and peace was `restored`!! We will talk more about the `preserve` and `restore` commands, which are often used together.
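
As a preview, here is a minimal sketch of the `preserve`/`restore` pattern on a toy dataset. We have not looked inside `samplekeep` here, so this is not a description of that program; the variable `id` and the choice to keep 10 of 100 rows are assumptions made purely for illustration.

```Stata
* Toy data: 100 made-up rows
clear
set obs 100
gen id = _n
di c(N)          // 100

preserve         // take a snapshot of the data in memory
    keep in 1/10 // a "disturbance": throw away most of the rows
    di c(N)      // 10
restore          // bring the snapshot back

di c(N)          // 100 again
```

Anything done between `preserve` and `restore` is undone once `restore` runs, which makes the pair handy for temporary subsetting.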

Back to the `by` command:

```Stata
bys abo: egen age_byabo = mean(age)
```

Any idea what this command just achieved?

```Stata
codebook age_byabo
```

What do you notice? Only four unique values! The `egen` command, like the `gen` command, is used to define new variables in the dataset. However, `gen` computes each value row by row, at the level of the individual observation. An `egen` function such as `mean()` computes a summary statistic instead; combined with `by`, the statistic is computed within each group and copied to every row of that group. To learn more, type `h egen`.
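
A small sketch may make the contrast concrete. The ages below are invented, and `age_dev` is a hypothetical variable added only to show `gen` working row by row on top of the group mean from `egen`.

```Stata
* Toy data: made-up ages, not from transplants.dta
clear
input byte abo float age
1 40
1 50
2 30
4 20
4 40
4 60
end

* egen with by: one mean per group, repeated on every row of the group
bysort abo: egen age_byabo = mean(age)

* gen: evaluated separately for each row
gen age_dev = age - age_byabo

list, sepby(abo)
```

Here `age_byabo` takes only three distinct values (one per group in the toy data), while `age_dev` differs from person to person.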

```Stata
qui regress age i.abo
local abo_H: di %3.2f e(r2_a)*100

di "ABO blood group may influence survivorship on a kidney transplant waitlist. People with blood type O are universal donors, whereas those with type AB are universal recipients. Hypothetically, these two represent the extremes, and types A and B should fall in between. But in this cross-sectional dataset, blood group explains only `abo_H'% of the variance in age. Only a longitudinal data structure would be in a position to investigate survivorship."
```

Enough. I don't wish to trouble you with a bogus hypothesis but merely wish to explore return values from a regression!

```Stata
lincom _b[3.abo]
```

Which blood group has value 3?

```Stata
return list
di %3.2f r(p)
di %3.2f r(estimate)
```
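
One thing worth knowing about `r()` values is that they are short-lived: the next r-class command replaces them. Here is a minimal sketch, on invented toy data, of saving them in local macros first. The macro names `est` and `se` are our own choice.

```Stata
* Toy data: made-up ages, not from transplants.dta
clear
input byte abo float age
1 40
1 50
2 30
2 34
3 55
3 45
4 20
4 40
4 60
end

qui regress age i.abo
lincom _b[3.abo]

* Save what we need before another command overwrites r()
local est = r(estimate)
local se  = r(se)

qui summarize age     // an r-class command: r() now holds its results
di r(estimate)        // no longer available, so this shows missing
di %3.2f `est'        // the saved copy is still here
di %3.2f `se'
```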

After a regression we may type `ereturn list` to see all sorts of stored estimation results. One of them is the coefficient vector `e(b)`:

```Stata
matrix define b = e(b)
matrix list b
di b[1,3]
```
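
Picking a coefficient by position, as in `b[1,3]`, depends on knowing the column order. A hedged sketch on invented toy data shows how to check the column names and how to ask for the same coefficient by name with `_b[]`. The macro name `names` is our own.

```Stata
* Toy data: made-up ages, not from transplants.dta
clear
input byte abo float age
1 40
1 50
2 30
2 34
3 55
3 45
4 20
4 40
4 60
end

qui regress age i.abo
matrix define b = e(b)

* Column names tell us which coefficient sits in which position
local names : colfullnames b
di "`names'"

* By position versus by name: both refer to the coefficient on 3.abo
di b[1,3]
di _b[3.abo]
```

Referring to the coefficient by name keeps the code correct even if the model later gains or loses terms and the column order shifts.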



