Commit

from surf
xiaodaigh committed Mar 4, 2019
1 parent 47d1d6e commit f2677a1
Showing 10 changed files with 28 additions and 236 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -33,3 +33,4 @@ tmp_*/
model.rds
scorecard.rds
first_harp_date.df
.ipynb_checkpoints
2 changes: 2 additions & 0 deletions inst/surf_2019_02_demo/.gitignore
@@ -1,3 +1,5 @@
.ipynb_checkpoints
surf_201902.nb.html
surf_201902.html
01_surf_201902.html
01_surf_201902.tex
25 changes: 24 additions & 1 deletion inst/surf_2019_02_demo/01_surf_201902.rmd
@@ -3,6 +3,7 @@ title: "Simple Fannie Mae Example"
output:
html_document:
df_print: paged
pdf_document: default
---

```{r}
@@ -53,7 +54,9 @@ select
count(*) as cnt,
n_default/cnt as odr # observed default rate
from
table;
table
group by
monthly.rpt.prd;
```
This analysis only uses two columns, namely `default_12m` and `monthly.rpt.prd`. So I use `srckeep` to ensure that only those two columns are loaded.

@@ -78,6 +81,26 @@ system.time(a_wh2 <- a_wh1 %>% collect) # 60~70 plugged in
a_wh2
```

```{r}
a_wh1 %>%
srckeep(c("monthly.rpt.prd", "default_12m")) %>%
map(function(chunk) {
chunk[1,]
}) %>%
collect
```


```{r}
a_wh1 %>%
srckeep(c("monthly.rpt.prd", "default_12m")) %>%
map(~{
.x[1,.SD]
}) %>%
collect
```


Once `collect` is called, the resulting data is stored as a data.frame.

However, this is not the correct result, because the group by was performed within each chunk. Hence we need a second-stage group by. The second group by takes no time at all, as everything is already in memory.
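
As a rough illustration of the two-stage idea, here is a minimal base R sketch. It does not use disk.frame at all: the list `chunks`, its toy values, and the helper `summarise_chunk` are hypothetical stand-ins for the chunks of `a_wh1`. Stage one keeps sums and counts per chunk (not rates), and stage two re-aggregates them before computing the observed default rate.

```r
# Hypothetical toy data: two data.frames standing in for two on-disk chunks.
chunks <- list(
  data.frame(monthly.rpt.prd = c("2019-01", "2019-01", "2019-02"),
             default_12m     = c(TRUE, FALSE, FALSE)),
  data.frame(monthly.rpt.prd = c("2019-01", "2019-02", "2019-02"),
             default_12m     = c(FALSE, TRUE, TRUE))
)

# Stage 1: group by within a single chunk, returning sums and counts only.
# (summarise_chunk is an illustrative helper, not part of any package.)
summarise_chunk <- function(chunk) {
  n_default <- tapply(chunk$default_12m, chunk$monthly.rpt.prd, sum)
  cnt       <- tapply(chunk$default_12m, chunk$monthly.rpt.prd, length)
  data.frame(monthly.rpt.prd = names(n_default),
             n_default       = as.vector(n_default),
             cnt             = as.vector(cnt))
}

# Stacking the per-chunk results leaves one row per (chunk, period) pair,
# so the same period can still appear more than once.
stage1 <- do.call(rbind, lapply(chunks, summarise_chunk))

# Stage 2: group by again over the small in-memory result, then compute the rate.
stage2 <- aggregate(cbind(n_default, cnt) ~ monthly.rpt.prd,
                    data = stage1, FUN = sum)
stage2$odr <- stage2$n_default / stage2$cnt  # observed default rate
stage2
```

Carrying `n_default` and `cnt` through stage one matters: averaging per-chunk rates would weight chunks equally regardless of how many rows each holds for a given period.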
47 changes: 0 additions & 47 deletions inst/surf_2019_02_demo/_archive/01a_read_from_csv.r

This file was deleted.

19 changes: 0 additions & 19 deletions inst/surf_2019_02_demo/_archive/01d_a_rbind_all_data_together.r

This file was deleted.

10 changes: 0 additions & 10 deletions inst/surf_2019_02_demo/_archive/01d_b_OPTIONAL_rechunk.r

This file was deleted.

76 changes: 0 additions & 76 deletions inst/surf_2019_02_demo/_archive/02a_create_forward_looking_flag.r

This file was deleted.

This file was deleted.

34 changes: 0 additions & 34 deletions inst/surf_2019_02_demo/_archive/02d_plot_odr.r

This file was deleted.

1 change: 1 addition & 0 deletions presentation/.gitignore
@@ -1 +1,2 @@
*.pptx
*.pdf
