---
title: "Project Home"
site: workflowr::wflow_site
# output:
#   workflowr::wflow_html:
#     toc: false
output:
  bookdown::html_document2:
    toc: false
editor_options:
  chunk_output_type: console
  markdown:
    wrap: 72
---

This is the website for the research project "Frequency-Aware Similarity
Calibration".

If you have cloned the project to a local computer, this website is
rendered in the `docs` subdirectory of the project directory.
If you are using `workflowr` to publish the research website, it will
also be rendered online to GitHub Pages.

This page acts as a table of contents for the website. There are links
to the webpages generated from the analysis notebooks and to the
rendered versions of manuscripts/documents/presentations.

------------------------------------------------------------------------

## [Proposal](proposal.html) {.unnumbered}

This notebook explains the central ideas behind the project.

## [Notes](notes.html) {.unnumbered}

This notebook is for keeping notes on any points that may be useful for
later project or manuscript development and that are either not covered
in the analysis notebooks or at risk of getting lost in them.

------------------------------------------------------------------------

# Analysis Notebooks {.unnumbered}

## 01 Read, check, and standardise the entity data {.unnumbered}

Initial data preparation of imported entity records.

### [01-1 Get, subset, check, and save data](01-1_get_data.html) {.unnumbered}

Import the raw data, cut it back to the subset of rows and columns that
are possibly useful, sanity check the data, and save the data in an
R-friendly format.
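
As a rough illustration of the import, subset, check, and save steps, a
minimal base R sketch might look like the following. The file, the
column names, and the checks are hypothetical stand-ins, not the
project's actual data or code; the linked notebook is the authoritative
version.

```r
# Hypothetical raw file and column names (illustration only).
raw_path <- tempfile(fileext = ".csv")
writeLines(
  c("id,last_name,first_name,age,unused",
    "1,SMITH,ANN,34,x",
    "2,JONES,BOB,51,y",
    "3,SMITH,CAL,27,z"),
  raw_path
)

# Import every column as character so no value is coerced on read.
raw <- read.csv(raw_path, colClasses = "character", stringsAsFactors = FALSE)

# Cut back to the rows and columns that might be useful.
keep_cols <- c("id", "last_name", "first_name", "age")
d <- raw[raw$last_name != "", keep_cols]

# Sanity checks: stop early if a basic expectation fails.
stopifnot(
  nrow(d) > 0,
  anyDuplicated(d$id) == 0,
  !anyNA(d$last_name)
)

# Save in an R-native format and confirm that it round-trips.
rds_path <- tempfile(fileext = ".rds")
saveRDS(d, rds_path)
d_reloaded <- readRDS(rds_path)
```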

### [01-2 Check administrative variables](01-2_check_admin.html) {.unnumbered}

Check the "administrative" variables. These are data relating to the
administration of voter registration.

### [01-3 Check residence variables](01-3_check_resid.html) {.unnumbered}

Check the residence variables - residential address and phone number.

### [01-4 Check demographic variables](01-4_check_demog.html) {.unnumbered}

Check the demographic variables - sex, age, and birth place.

### [01-5 Check name variables](01-5_check_name.html) {.unnumbered}

Check the name variables.

### [01-6 Clean variables](01-6_clean_vars.html) {.unnumbered}

Clean all the variables.

------------------------------------------------------------------------

## 02 Blocking variables {.unnumbered}

Examine the distributions of potential blocking variables.

------------------------------------------------------------------------

## 03 Name frequency (equality) {.unnumbered}

Detailed examination of the distributions of name frequencies induced by
the string equality relation.
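
To make "frequency induced by the string equality relation" concrete,
here is a small base R sketch. It assumes that the frequency of a name
is the number of records carrying exactly that string; the names are toy
values, not project data.

```r
# Toy names, purely illustrative (not project data).
last_name <- c("SMITH", "SMITH", "SMYTH", "JONES", "JONES", "JONES", "NGUYEN")

# Under string equality each distinct string is its own class, so the
# frequency of a name is the number of records that share it exactly.
name_freq <- table(last_name)

# Attach to every record the frequency of its own name.
rec_freq <- as.integer(name_freq[last_name])

# Distribution of name frequencies: how many distinct names occur k times.
freq_of_freq <- table(as.integer(name_freq))
```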

------------------------------------------------------------------------

## 04 Name frequency (similarity) {.unnumbered}

Detailed examination of the distributions of name frequencies induced by
a string similarity relation.
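
One way to picture a similarity-induced frequency is to count, for each
record, the records whose name is "similar" to it. The sketch below
assumes a hypothetical relation (Levenshtein distance of at most 1,
computed with base R's `utils::adist`); the project's own similarity
measure and threshold may well differ. With this relation, spelling
variants such as SMITH and SMYTH, which are one substitution apart, are
counted together.

```r
# Toy names, purely illustrative (not project data).
last_name <- c("SMITH", "SMITH", "SMYTH", "JONES", "JONES", "JONES", "NGUYEN")

# Pairwise Levenshtein distances between the records' names.
dist_mat <- utils::adist(last_name)

# Hypothetical similarity relation: edit distance of at most 1.
max_dist <- 1
similar <- dist_mat <= max_dist

# Similarity-induced frequency: the number of records similar to each
# record, including the record itself.
sim_freq <- rowSums(similar)
```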

------------------------------------------------------------------------

## 05 Similarity calibration {.unnumbered}

Detailed examination of the calibration from similarity to probability
of identity match, both unconditionally and as a function of name
frequency.

------------------------------------------------------------------------

## 06 Compatibility models {.unnumbered}

Estimate multivariate compatibility models.

------------------------------------------------------------------------

# Manuscripts {.unnumbered}