glossary.yml
ACSPRI: |
  Australian Consortium for Social and Political Research Inc. organizes conferences and delivers courses. (<a href="https://www.acspri.org.au/">ACSPRI</a>)
Adjusted-count-R2: |
Adjusted count R-squared is a measure of model fit for binary logistic regression that adjusts the percent correctly predicted (or count R-squared) by the model for the number of people in the largest outcome category. (SwR, Glossary)
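A minimal Python sketch of the idea in this entry (not from the glossary source): the count of correct predictions is adjusted by the size of the largest outcome category, so the statistic reflects improvement over always guessing the most common outcome. The function name and example numbers are illustrative.

```python
def adjusted_count_r2(n_correct, n_largest_category, n_total):
    # correct predictions beyond always guessing the largest category,
    # as a share of the predictions that could be improved
    return (n_correct - n_largest_category) / (n_total - n_largest_category)

# e.g. 80 of 100 correct when the largest outcome category holds 60 cases
print(adjusted_count_r2(80, 60, 100))  # 0.5
```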
Adjusted-R2: |
  Adjusted R-squared is a measure of model fit for ordinary least squares linear regression that penalizes the R-squared, or percentage of variance explained, for the number of variables in the model. (SwR, Glossary)
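As a hedged sketch of the penalty described here (using the common adjustment formula, which is an assumption, not taken from the glossary source): with n observations and k predictors, adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1).

```python
def adjusted_r2(r2, n, k):
    # penalize R-squared for the number of predictors k given n observations
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# R-squared of 0.50 with 101 observations and 5 predictors
print(adjusted_r2(0.50, 101, 5))  # 0.4736842105263158
```

Adding predictors can only raise R², but adjusted R² falls unless the new predictor explains enough variance to offset the penalty.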
Advanced Organizer: |
Advance organizers are pedagogic devices that bridge the gap between what learners already know and what learners need to know. (<a href="https://link.springer.com/referenceworkentry/10.1007/978-1-4419-1428-6_157">Encyclopedia of the Science of Learning</a>)
AIC: |
  Akaike’s information criterion (AIC) compares the quality of a set of statistical models to each other. The AIC will rank the models from best to worst, but it won’t say anything about absolute quality. The basic formula is defined as AIC = -2(log-likelihood) + 2K. (<a href="https://www.statisticshowto.com/akaikes-information-criterion/">Statistics How-To</a>)
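The formula in this entry can be sketched directly in Python (the log-likelihood values below are made up for illustration):

```python
def aic(log_likelihood, k):
    # AIC = -2(log-likelihood) + 2K, where K is the number of parameters
    return -2 * log_likelihood + 2 * k

m1 = aic(-120.3, 4)   # 248.6
m2 = aic(-119.9, 6)   # 251.8
# lower AIC means a better relative fit, so m1 is preferred here:
# the small gain in likelihood does not justify two extra parameters
```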
Alpha: |
Alpha is the threshold for the upper limit for statistical significance set prior to analyses that limits the probability of a Type I error; an alpha of .05 would result in p-values below .05 being considered statistically significant. (SwR, Glossary)
Alternate Hypothesis: |
  An alternate hypothesis (HA or sometimes written as H1) is a claim that there is a difference or relationship among things; the alternate hypothesis is paired with the null hypothesis, which typically states there is no relationship or no difference between things. (SwR, Glossary)
amfAR: |
amfAR, the Foundation for AIDS Research, known until 2005 as the American Foundation for AIDS Research, is an international nonprofit organization dedicated to the support of AIDS research, HIV prevention, treatment education, and the advocacy of AIDS-related public policy. (<a href="https://en.wikipedia.org/wiki/AmfAR">Wikipedia</a>)
Anderson-Darling: |
The Anderson-Darling Goodness of Fit Test (AD-Test) is a measure of how well your data fits a specified distribution. It’s commonly used as a test for normality. (<a href="https://www.statisticshowto.com/anderson-darling-test/">Statistics How-To</a>)
ANOVA: |
Analysis of variance is a statistical method used to compare means across groups to determine whether there is a statistically significant difference among the means; typically used when there are three or more means to compare. (SwR, Glossary)
APIx: |
An API, or application programming interface, is a set of defined rules that enable different applications to communicate with each other. It acts as an intermediary layer that processes data transfers between systems, letting companies open their application data and functionality to external third-party developers, business partners, and internal departments within their companies. (<a href="https://www.ibm.com/topics/api">IBM</a>)
Arcsine Transformations: |
  Arcsine transformations are data transformation techniques often recommended to normalize percent or proportion data; the arcsine transformation applies the inverse of the sine function to the square root of the variable being transformed. (SwR, Glossary)
Arithmetic Mean: |
The arithmetic mean, also known as "arithmetic average", is the sum of the values divided by the number of values. If the data set were based on a series of observations obtained by sampling from a statistical population, the arithmetic mean is the sample mean $\overline{x}$ (pronounced x-bar) to distinguish it from the mean, or expected value, of the underlying distribution, the population mean $\mu$ (pronounced /'mjuː/). (<a href="https://en.wikipedia.org/wiki/Mean">Wikipedia</a>)
ATF Agency: |
Bureau of Alcohol, Tobacco, Firearms and Explosives (ATF): ATF’s responsibilities include the investigation and prevention of federal offenses involving the unlawful use, manufacture, and possession of firearms and explosives; acts of arson and bombings; and illegal trafficking of alcohol and tobacco products. The ATF also regulates, via licensing, the sale, possession, and transportation of firearms, ammunition, and explosives in interstate commerce. (<a href="https://www.atf.gov/about/what-we-do">ATF</a>)
Backtick: |
The backtick ` is a typographical mark used mainly in computing. It is also known as backquote, grave, or grave accent. (<a href="https://en.wikipedia.org/wiki/Backtick">Wikipedia</a>)
Bar Charts: |
Bar charts are visual displays of data often used to examine similarities and differences across categories of things; bars can represent frequencies, percentages, means, or other statistics. (SwR, Glossary)
Basis Function: |
  B-splines invent a series of entirely new, synthetic predictor variables. Each of these synthetic variables exists only to gradually turn a specific parameter on and off within a specific range of the real predictor variable. Each of the synthetic variables is called a basis function. (Chap.4)
Bayes Factor: |
The Bayes factor is a ratio of two competing statistical models represented by their evidence, and is used to quantify the support for one model over the other. (<a href="https://en.wikipedia.org/wiki/Bayes_factor">Wikipedia</a>)
Bayesian Reasoning: |
Bayesian reasoning is the formal process that we use to update our beliefs about the world once we’ve observed some data.
Bayesian Updating: |
A Bayesian model begins with one set of plausibilities assigned to each of these possibilities. These are the prior plausibilities. Then it updates them in light of the data, to produce the posterior plausibilities. This updating process is a kind of learning. (Chap.2)
Bayesian Statistics: |
  Also known as evidential probability, is the process of adding prior probability to a hypothesis and adjusting that probability as new information becomes available. Unlike traditional frequentist probability, which only accounts for the previous frequency of an event to predict an outcome, the Bayesian model begins with an initial set of subjective assumptions (prior probability) and adjusts them accordingly through trial and experimentation (posterior probability). Instead of only rejecting or failing to reject a null hypothesis, Bayesian probability allows someone to quantify how much confidence they should have in a particular result. (<a href="https://deepai.org/machine-learning-glossary-and-terms/bayesian-probability">deepai.org</a>)
Bayes’ Theorem: |
This is the theorem that gives Bayesian data analysis its name. But the theorem itself is a trivial implication of probability theory. The mathematical definition of the posterior distribution arises from Bayes' Theorem. The key lesson is that the posterior is proportional to the product of the prior and the probability of the data. (Chap.2)
Bessel’s Correction: |
Bessel's correction is the use of n − 1 instead of n in the formula for the sample variance and sample standard deviation, where n is the number of observations in a sample. This method corrects the bias in the estimation of the population variance. It also partially corrects the bias in the estimation of the population standard deviation. However, the correction often increases the mean squared error in these estimations. This technique is named after Friedrich Bessel. (<a href="https://en.wikipedia.org/wiki/Bessel%27s_correction">Wikipedia</a>)
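A small Python sketch of the n versus n − 1 divisor described in this entry (the data values are arbitrary illustration):

```python
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(data)
mean = sum(data) / n
ss = sum((x - mean) ** 2 for x in data)  # sum of squared deviations

population_variance = ss / n        # biased when computed from a sample
sample_variance = ss / (n - 1)      # Bessel's correction: divide by n - 1

print(population_variance)  # 4.0
print(sample_variance)      # 4.571428571428571
```

The corrected estimate is always a little larger, compensating for the fact that deviations are measured from the sample mean rather than the unknown population mean.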
Beta Distribution: |
  It is a family of continuous probability distributions defined on the interval [0, 1] in terms of two positive parameters, denoted by alpha (α) and beta (β), that control the shape of the distribution. (<a href="https://en.wikipedia.org/wiki/Beta_distribution">Wikipedia</a>) The Beta distribution represents a probability distribution of probabilities. (<a href="https://stats.stackexchange.com/a/47782/207389">stats.stackexchange</a>) You use the beta distribution to estimate the probability of an event for which you’ve already observed a number of trials and the number of successful outcomes. For example, you would use it to estimate the probability of flipping a heads when so far you have observed 100 tosses of a coin and 40 of those were heads. (BF, Chap.5)
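The coin example in this entry can be sketched with stdlib Python, assuming a flat Beta(1, 1) prior (that prior choice is an assumption for illustration, not from the source): observing 40 heads in 100 tosses gives a Beta(1 + 40, 1 + 60) posterior.

```python
# Beta-binomial updating: prior Beta(1, 1), then 40 heads and 60 tails
alpha, beta = 1 + 40, 1 + 60

# the mean of a Beta(alpha, beta) distribution is alpha / (alpha + beta)
posterior_mean = alpha / (alpha + beta)
print(posterior_mean)  # 0.4019607843137255
```

The posterior mean sits near the observed proportion 40/100, pulled slightly toward the prior.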
BIC: |
  The Bayesian Information Criterion (BIC) is an index used in Bayesian statistics to choose between two or more alternative models. It is also known as the Schwarz information criterion (abbrev. SIC) or the Schwarz-Bayesian information criterion. It is defined as: k log(n) - 2 log(L(θ̂)). (<a href="https://www.statisticshowto.com/bayesian-information-criterion/">Statistics How-To</a>)
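The definition translates directly into code (the log-likelihood and sample size below are illustrative):

```python
import math

def bic(log_likelihood, k, n):
    # BIC = k*log(n) - 2*log(L); lower values indicate the preferred model
    return k * math.log(n) - 2 * log_likelihood

# a model with 4 parameters, log-likelihood -120.3, fit to 200 observations
print(bic(-120.3, 4, 200))
```

Compared with AIC's fixed 2K penalty, BIC's k·log(n) term grows with sample size, so BIC penalizes extra parameters more heavily in large samples.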
Binomials: |
  Binomials are algebraic expressions that have two terms. For example, $2x + 1$ and $-4y^2 + 3y$ are both binomials. ([Khan Academy](https://www.khanacademy.org/math/algebra/x2f8bb11595b61c86:quadratics-multiplying-factoring/x2f8bb11595b61c86:factor-quadratics-strategy/a/quadratics-multiplying-factoring-faq))
Binomial Coefficient: |
  The binomial coefficient is the number of ways of picking *k* unordered outcomes from *n* possibilities, also known as a combination or combinatorial number. (<a href="https://mathworld.wolfram.com/BinomialCoefficient.html">Wolfram MathWorld</a>)
Binomial Distribution: |
It is used to calculate the probability of a certain number of successful outcomes, given a number of trials and the probability of the successful outcome. The “bi” in the term binomial refers to the two possible outcomes: an event happening and an event not happening. (BF, Chap.4)
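A minimal Python sketch of the binomial probability mass function this entry describes, combining the binomial coefficient with the two outcome probabilities:

```python
from math import comb

def binom_pmf(k, n, p):
    # P(exactly k successes in n trials with success probability p)
    return comb(n, k) * p**k * (1 - p)**(n - k)

# probability of exactly 2 heads in 3 fair coin flips
print(binom_pmf(2, 3, 0.5))  # 0.375
```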
BHE: |
Piketty, T. (2022). A Brief History of Equality (S. Rendall, Trans.). Harvard University Press.
Bonferroni: |
Bonferroni post hoc test is a pairwise test used after a statistically significant ANOVA that conducts a t-test for each pair of means but adjusts the threshold for statistical significance to ensure that there is a small enough risk of Type I error; it is generally considered a very conservative post hoc test that only identifies the largest differences between means as statistically significant. (SwR, Glossary)
Boxplots: |
Boxplots are a visual representation of data that shows central tendency (usually the median) and spread (usually the interquartile range) of a numeric variable for one or more groups; boxplots are often used to compare the distribution of a continuous variable across several groups. (SwR, Glossary)
Breusch-Pagan: |
Breusch-Pagan is a statistical test for determining whether variance is constant, which is used to test the assumption of homoscedasticity; Breusch-Pagan relies on the [chi-squared] distribution and is used during assumption checking for [homoscedasticity] in [linear regression]. (SwR, Glossary)
BRFSS: |
The Behavioral Risk Factor Surveillance System (BRFSS) is a system of health-related telephone surveys that collect state data about U.S. residents regarding their health-related risk behaviors, chronic health conditions, and use of preventive services. Established in 1984 with 15 states, BRFSS now collects data in all 50 states as well as the District of Columbia and three U.S. territories. BRFSS completes more than 400,000 adult interviews each year, making it the largest continuously conducted health survey system in the world. (<a href="https://www.cdc.gov/brfss/index.html">BRFSS</a>)
Brown-Forsythe: |
Brown-Forsythe is an alternate F-statistic used for analysis of variance when the assumption of homogeneity of variance is not met; the Brown-Forsythe F-statistic is computed after transforming the values of the outcome to represent the distance from the median. (SwR, Glossary)
brms: |
  brms stands for Bayesian Regression Models using Stan. brms provides an interface to fit Bayesian generalized (non-)linear multivariate multilevel models using Stan. The formula syntax is very similar to that of the package lme4 to provide a familiar and simple interface for performing regression analyses. (<a href="https://paul-buerkner.github.io/brms/">brms website</a>)
BUGS: |
  The BUGS (Bayesian inference Using Gibbs Sampling) project is concerned with flexible software for the Bayesian analysis of complex statistical models using Markov chain Monte Carlo (MCMC) methods. (<a href="https://www.mrc-bsu.cam.ac.uk/software/bugs/">The BUGS Project</a>) It is -- together with some newer derivatives like <a href="https://www.mrc-bsu.cam.ac.uk/software/bugs/openbugs/">OpenBUGS</a> and <a href="https://www.multibugs.org/">MultiBUGS</a> -- no longer under active development. Use JAGS, the Stan platform, or NIMBLE instead.
B-Spline: |
  The term "spline" refers to a wide class of basis functions that are used in applications requiring data interpolation and/or smoothing. Splines are special functions defined piecewise by polynomials; the result is a smooth function built out of smaller component functions. In interpolation problems, spline interpolation is often preferred to polynomial interpolation because it yields similar results even when using low-degree polynomials, while avoiding large numbers and the problem of oscillation at the edges of intervals that comes with higher degrees. The term "spline" comes from the flexible [spline](https://en.wikipedia.org/wiki/Flat_spline) devices used by shipbuilders and draftsmen to draw smooth shapes. They used a long, thin piece of wood or metal that could be anchored in a few places in order to aid drawing curves.
There are many types of splines, especially the common-place 'B-splines': The 'B' stands for 'basis,' which just means 'component.' B-splines build up wiggly functions from simpler less-wiggly components. Those components are called basis functions. B-splines force you to make a number of choices that other types of splines automate. (Chap.4)
Capital Income: |
Capital income is defined as the sum of property income, including dividends, interest gains, reinvested earnings on foreign direct investment, investment income disbursements and land rents ([Distributional National Accounts (DINA) for Austria, 2004-2016](https://wid.world/document/distributional-national-accounts-dina-for-austria-2004-2016-world-inequality-lab-wp-2020-23/)).
CE/AD: |
CE is an abbreviation for Common Era. It means the same as AD (Anno Domini) and represents the time from year 1 and onward. (<a href="https://www.timeanddate.com/calendar/ce-bce-what-do-they-mean.html">timeanddate</a>)
CDC: |
  Centers for Disease Control and Prevention (CDC) is the U.S. leading science-based, data-driven service organization that protects the public’s health. (<a href="https://www.cdc.gov/about/">CDC</a>)
CDF: |
  A cumulative distribution function (CDF) tells us the probability that a random variable takes on a value less than or equal to x. (<a href="https://www.statology.org/cdf-vs-pdf/">Statology</a>) It sums all parts of the distribution, replacing a lot of calculus work. The CDF takes in a value and returns the probability of getting that value or lower. (BF, Chap.13) A CDF is a hypothetical model of a distribution; the ECDF models empirical (i.e. observed) data. (<a href="https://www.statisticshowto.com/empirical-distribution-function/">Statistics How To</a>)
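Python's stdlib `statistics.NormalDist` exposes exactly this "value in, probability of that value or lower out" behavior for the normal distribution:

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)
# probability that a standard normal value is <= 0
print(z.cdf(0))                     # 0.5
# probability of a value between -1.96 and 1.96 (the familiar ~95%)
print(z.cdf(1.96) - z.cdf(-1.96))
```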
Ceiling: |
A ceiling effect happens when many observations are at the highest possible value for a variable. (SwR, Glossary)
Centering: |
  Subtracting the mean leads to a lack of covariance among the parameters. In centering, you are changing the values but not the scale. So a predictor that is centered at the mean has new values: the entire scale has shifted so that the mean now has a value of 0, but one unit is still one unit. The intercept will change, but the regression coefficient for that variable will not. Since the regression coefficient is interpreted as the effect on the mean of Y for each one-unit difference in X, it doesn’t change when X is centered. ([The Analysis Factor](https://www.theanalysisfactor.com/centering-and-standardizing-predictors/)) (SR Chap.4)
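The claim that centering changes the intercept but not the slope can be checked with a tiny stdlib sketch (the data and the hand-rolled OLS helper are illustrative assumptions, not from the source):

```python
from statistics import mean

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

def ols(xs, ys):
    # simple least-squares fit: slope = cov(x, y) / var(x)
    mx, my = mean(xs), mean(ys)
    slope = (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
             / sum((a - mx) ** 2 for a in xs))
    intercept = my - slope * mx
    return intercept, slope

b0, b1 = ols(x, y)                   # original predictor
xc = [a - mean(x) for a in x]        # centered predictor
b0c, b1c = ols(xc, y)

# slope is identical; the new intercept equals the mean of y
print(b1, b1c)
print(b0c, mean(y))
```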
Central Limit Theorem: |
A foundational idea in inferential statistics that shows the mean of a sampling distribution of a variable will be a close approximation to the mean of the variable in the population, regardless of whether the variable is normally distributed. The Central Limit Theorem demonstrates why samples can be used to infer information about the population. (SwR, Glossary)
Chi-squared: |
Chi-squared is the test statistic following the chi-squared probability distribution; the chi-squared test statistic is used in inferential tests, including examining the association between two categorical variables and determining statistical significance for a logistic regression model. (SwR, Glossary)
Cisgender: |
People whose gender identity matches their biological sex. Contrast: Transgender (SwR, Chapter 2)
Cochran’s Q-test: |
Cochran’s Q-test is an alternative to the chi-squared test of independence for when observations are not independent; for example, comparing groups before and after an intervention would fail the independent observations assumption (SwR, Glossary)
Coefficient: |
In mathematics, a coefficient is a multiplicative factor involved in some term. It may be a number (= numerical factor) or it may be a constant with units of measurement (= constant multiplier). (<a href="https://en.wikipedia.org/wiki/Coefficient">Wikipedia</a>)
Cohen’s d: |
Cohen’s d is a standardized effect size for measuring the difference between two group means. It is frequently used to compare a treatment to a control group. It can be a suitable effect size to include with t-test and ANOVA results. (<a href= "https://statisticsbyjim.com/basics/cohens-d/">Statistics by Jim</a>)
Color blindness: |
  Color blindness (also spelled colour blindness) or color vision deficiency (CVD) includes a wide range of causes and conditions and is actually quite complex. It's a condition characterized by an inability or difficulty in perceiving and differentiating certain colors due to abnormalities in the three color-sensing pigments of the cones in the retina. (<a href="https://enchroma.com/pages/types-of-color-blindness">EnChroma</a>)
Combination: |
  The number of possible arrangements in a collection of items where the order (in contrast to [permutation]) does not matter. (<a href="https://corporatefinanceinstitute.com/resources/data-science/combination/" target="_blank">Corporate Finance Institute (CFI)</a>)
Combinatorics: |
Combinatorics is an area of mathematics primarily concerned with counting. (BS, Chap.2) One of the basic problems of combinatorics is to determine the number of possible configurations of a given type. (<a href="https://www.britannica.com/science/combinatorics">Britannica</a>)
Compatibility Interval: |
  Two parameter values that contain between them a specified amount of posterior probability, a probability mass, are usually known as a confidence interval, which may instead be called a credible interval. We’re going to call it a compatibility interval instead, in order to avoid the unwarranted implications of "confidence" and "credibility." What the interval indicates is a range of parameter values compatible with the model and data. The model and data themselves may not inspire confidence, in which case the interval will not either. (SR2, Chap.3)
Conditional Probability: |
  In probability theory, conditional probability is a measure of the probability of an event occurring, given that another event (by assumption, presumption, assertion or evidence) has already occurred. (<a href="https://en.wikipedia.org/wiki/Conditional_probability">Wikipedia</a>) The mathematical notation uses the pipe symbol ('|') for "conditional on" or "given that".
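The definition P(A | B) = P(A and B) / P(B) can be illustrated with a single fair die roll (the events chosen are arbitrary):

```python
# A = "roll is even", B = "roll is greater than 3"
outcomes = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}
B = {4, 5, 6}

p_B = len(B) / len(outcomes)                  # 1/2
p_A_and_B = len(A & B) / len(outcomes)        # {4, 6} -> 1/3
p_A_given_B = p_A_and_B / p_B
print(p_A_given_B)  # 0.6666666666666666
```

Knowing the roll exceeded 3 raises the probability of "even" from 1/2 to 2/3, because two of the three remaining outcomes are even.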
Confidence Interval: |
A range of values, calculated from the sample observations, that is believed, with a particular probability, to contain the true parameter value. (Cambridge Dictionary of Statistics, 4th ed., p.98)
Conjugate Prior: |
If the [posterior distribution] is in the same probability distribution family as the prior probability distribution the prior and posterior are then called conjugate distributions, and the prior is called a conjugate prior for the likelihood function. A conjugate prior is an algebraic convenience; otherwise, numerical integration may be necessary. (<a href="https://en.wikipedia.org/wiki/Conjugate_prior">Wikipedia</a>)
Contour Plot: |
Contour plots are a way to show a three-dimensional surface on a two-dimensional plane. A contour plot is appropriate if you want to see how some value Z changes as a function of two inputs, X and Y: `z = f(x, y)`. Contour lines indicate levels that are the same. ([Statistics How To](https://www.statisticshowto.com/contour-plots/))
Contrasts: |
Contrasts are sets of numbers used in planned contrasts to specify which means or groups of means to compare to each other, usually to identify statistically significant differences among means after a statistically significant analysis of variance. (SwR, Glossary)
Cook’s D: |
  Cook’s distance (often abbreviated Cook’s D) is used in regression analysis to find influential outliers in a set of predictor variables. It is a way to identify points that negatively affect the regression model. ([Statistics How To](https://www.statisticshowto.com/cooks-distance/))
Correlation: |
Correlation coefficients are a standardized measure of how two variables are related, or co-vary. They are used to measure how strong a relationship is between two variables. There are several types of correlation coefficient, but the most popular is Pearson’s. Pearson’s correlation (also called Pearson’s R) is a correlation coefficient commonly used in linear regression. ([Statistics How To](https://www.statisticshowto.com/probability-and-statistics/correlation-coefficient-formula/))
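Pearson's r from this entry, sketched with stdlib Python (the example vectors are made up to show the two extreme cases):

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    # standardized covariance: ranges from -1 to +1
    mx, my = mean(xs), mean(ys)
    num = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    den = sqrt(sum((a - mx) ** 2 for a in xs)
               * sum((b - my) ** 2 for b in ys))
    return num / den

print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))   # 1.0  (perfect positive)
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))   # -1.0 (perfect negative)
```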
Count-R-squared: |
Count R-squared, or percent correctly predicted, is a measure of model fit for a logistic regression model that is the proportion of the outcome values that were correctly predicted out of all observations modeled. (SwR, Glossary)
Covariance cov: |
Covariance is a measure of how much two random variables vary together. It’s similar to variance, but where variance tells you how a single variable varies, co variance tells you how two variables vary together. ([Statistics How To](https://www.statisticshowto.com/probability-and-statistics/statistics-definitions/covariance/))
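A minimal sketch of the sample covariance this entry describes (dividing by n − 1, per Bessel's correction; the data are illustrative):

```python
from statistics import mean

def sample_cov(xs, ys):
    # average co-deviation of paired values from their means
    mx, my = mean(xs), mean(ys)
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / (len(xs) - 1)

print(sample_cov([1, 2, 3], [2, 4, 6]))  # 2.0 (positive: they rise together)
```

Unlike correlation, covariance is unstandardized, so its magnitude depends on the units of both variables.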
Cramér’s V: |
Cramér’s V is an effect size to determine the strength of the relationship between two categorical variables; often reported with the results of a chi-squared. (SwR, Glossary)
CRAN: |
Comprehensive R Archive Network (CRAN) is a network of ftp and web servers around the world that store identical, up-to-date, versions of code and documentation for R. Please use the CRAN mirror nearest to you to minimize network load. [cran.r-project.org](https://cran.r-project.org/)
Credible Interval: |
  In Bayesian statistics, a credible interval is an interval within which an unobserved parameter value falls with a particular probability. It is an interval in the domain of a posterior probability distribution or a predictive distribution. The generalisation to multivariate problems is the credible region. Credible intervals are a Bayesian analog to confidence intervals in frequentist statistics. (<a href="https://en.wikipedia.org/wiki/Credible_interval">Wikipedia</a>)
Cross Validated: |
Cross Validated is a question and answer site for people interested in statistics, machine learning, data analysis, data mining, and data visualization. It is built and run by you as part of the Stack Exchange network of Q&A sites. (<a href="https://stats.stackexchange.com/tour">StackExchange</a>)
CSS: |
Cascading Style Sheets (CSS) is a style sheet language used for specifying the presentation and styling of a document written in a markup language such as HTML or XML (including XML dialects such as SVG, MathML or XHTML). CSS is a cornerstone technology of the World Wide Web, alongside HTML and JavaScript. CSS is designed to enable the separation of content and presentation, including layout, colors, and fonts. (<a href="https://en.wikipedia.org/wiki/CSS">Wikipedia</a>)
CSV: |
Text files where the values are separated with commas (Comma Separated Values = CSV). These files have the file extension .csv
Cumulative Distribution Function: |
A cumulative distribution function (CDF) tells us the probability that a random variable takes on a value less than or equal to x. (<a href="https://www.statology.org/cdf-vs-pdf/">Statology</a>) It sums all parts of the distribution, replacing a lot of calculus work. The CDF takes in a value and returns the probability of getting that value or lower. (BF, Chap.13) A CDF is a hypothetical model of a distribution, the ECDF models empirical (i.e. observed) data. (<a href="https://www.statisticshowto.com/empirical-distribution-function/">Statistics How To</a>)
CVD: |
Color vision deficiency (CVD) or color blindness (also spelled colour blindness) includes a wide range of causes and conditions and is actually quite complex. It's a condition characterized by an inability or difficulty in perceiving and differentiating certain colors due to abnormalities in the three color-sensing pigments of the cones in the retina. (<a href="https://enchroma.com/pages/types-of-color-blindness">EnChroma</a>)
Data Tables: |
Data Tables are --- in contrast to Display Tables --- for analysis and not for presentation. Another purpose of Data Tables is to store, retrieve and share information digitally. In R they are tibbles, data.frames etc.
Data Visualization: |
Using visual tools to examine or communicate about characteristics of data is called Data Visualization: Graphs are visual displays of data. (SwR, Glossary)
Degrees of Freedom: |
Degree of Freedom (df) is the number of pieces of information that are allowed to vary in computing a statistic before the remaining pieces of information are known; degrees of freedom are often used as parameters for distributions (e.g., chi-squared, F). (SwR, Glossary)
Density Plots: |
Density plots are used for examining the distribution of a variable measured along a continuum; density plots are similar to histograms but are smoothed and may not show existing gaps in data (SwR, Glossary)
Determination: |
Coefficient of determination is the percentage of variance in one variable that is accounted for by another variable or by a group of variables; often referred to as R-squared and used to determine model fit for linear models. (SwR, Glossary)
Deterministic: |
A deterministic equation, or model, has one precise value for y for each value of x. (SwR, Chap09)
Deviance: |
  Deviance, in logistic regression, is found by taking the differences between the observed values (0s and 1s) and the predicted values (percentages), squaring each difference, and adding up all the squared differences from each person in the data set; the deviance measures how well the model fits the data. (SwR, Glossary)
Deviation Scores: |
  Deviation scores show the difference from some value like the mean; for example, z-scores show the deviation from the mean in standard deviation units.
dfbeta: |
dfbeta are effects on coefficients of deleting each observation in turn. ("Companion to Applied Regression", car package, help file)
dfbetas: |
dfbetas are effect on coefficients of deleting each observation in turn, standardized by a deleted estimate of the coefficient standard error ("Companion to Applied Regression", car package, help file)
Diagnostics: |
Diagnostics in linear and logistic regression are a set of tests to identify outliers and influential values among the observations. (SwR, Glossary)
Digital Divide: |
The term "digital divide" refers to the gap between individuals, households, businesses and geographic areas at different socio-economic levels with regard to their opportunities to access information and communication technologies (ICTs). (<a href= "https://www.oecd-ilibrary.org/science-and-technology/understanding-the-digital-divide_236405667766">OECD Library</a>)
DINAs: |
Distributional national accounts (DINAs) aim to gather information on the distribution of the net national income and to explore it over time. ([Distributional National Accounts (DINA) for Austria, 2004-2016](https://wid.world/document/distributional-national-accounts-dina-for-austria-2004-2016-world-inequality-lab-wp-2020-23/))
Dispersion Matrix: |
  Also known as covariance matrix, variance-covariance matrix or variance matrix. It tells us how each parameter relates to every other parameter. It is a square matrix giving the covariance between each pair of elements of a given random vector. Any covariance matrix is symmetric and positive semi-definite, and its main diagonal contains variances (i.e., the covariance of each element with itself). ([Wikipedia](https://en.wikipedia.org/wiki/Covariance_matrix))
Display Tables: |
  Display Tables are in contrast to Data Tables. You would find them in a web page, a journal article, or a magazine. Such tables are for presentation, often to facilitate the construction of “Table 1”, i.e., the baseline characteristics table commonly found in research papers. (<a href="https://gt.rstudio.com/articles/gt.html">Introduction to Creating gt Tables</a>)
Donut Charts: |
Donut or doughnut charts (sometimes also called ring charts) are an alternative to pie charts; they have a hole in the middle, which can make them cleaner to read than pie charts. (<a href="https://r-charts.com/part-whole/donut-chart/">R Charts</a>)
Dummy Data: |
Simulated data are called dummy data to indicate that they are a stand-in for actual data. (SR2, p.62)
Dunn: |
Dunn’s post hoc test performs pairwise comparisons to determine which groups are statistically significantly different from one another following a significant [Kruskal-Wallis] test. (SwR, Glossary)
Durbin-Watson: |
Durbin-Watson test is a statistical test that is used to check the assumption of independent residuals in linear regression; a Durbin-Watson statistic of 2 indicates that the residuals are independent. (SwR, Glossary)
ExDA: |
Exploratory Data Analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often using statistical graphics and other data visualization methods. A statistical model can be used or not, but primarily EDA is for seeing what the data can tell us beyond the formal modeling, and it thereby contrasts with traditional hypothesis testing. (<a href="https://en.wikipedia.org/wiki/Exploratory_data_analysis">Wikipedia</a>)
ECDF: |
In statistics, an empirical distribution function (commonly also called an empirical cumulative distribution function, eCDF) is the distribution function associated with the empirical measure of a sample. This cumulative distribution function is a step function that jumps up by 1/n at each of the n data points. Its value at any specified value of the measured variable is the fraction of observations of the measured variable that are less than or equal to the specified value. (<a href="https://en.wikipedia.org/wiki/Empirical_distribution_function">Wikipedia</a>) A CDF is a hypothetical model of a distribution, the ECDF models empirical (i.e. observed) data. (<a href="https://www.statisticshowto.com/empirical-distribution-function/">Statistics How To</a>)
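The step-function definition above can be sketched in a few lines of Python (an illustrative sketch, not code from the cited sources):

```python
# ECDF sketch: the ECDF at x is the fraction of sample values <= x,
# so it jumps by 1/n at each of the n data points.
def ecdf(sample, x):
    return sum(v <= x for v in sample) / len(sample)

data = [1, 2, 2, 3, 5]
# ecdf(data, 2) counts the values 1, 2, 2 -> 3/5 = 0.6
```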
Effect Size: |
Effect size is a measure of the strength of a relationship; effect sizes are important in inferential statistics in order to determine and communicate whether a statistically significant result has practical importance. (SwR, Glossary)
Eta-squared: |
Eta-squared is an [effect size] interpreted as the proportion of variability in the continuous outcome variable that is explained by groups in an analysis of variance; recent research suggests that eta-squared is biased and that [omega-squared] may be a less biased alternative following analysis of variance. (SwR, Glossary)
EU-SILC: |
European Union Statistics on Income and Living Conditions (EU-SILC) is a survey on living conditions in the European Union. The survey helps to record living conditions, make poverty visible and monitor household income over the years. ([eurostat](https://ec.europa.eu/eurostat/web/microdata/european-union-statistics-on-income-and-living-conditions/))
Event Set: |
In probability theory, we use Ω (the capital Greek letter omega) to indicate the set of all events.
Explained Variance: |
Explained variance is variation in an outcome that is explained by a model; explained and unexplained variance are used in ANOVA and linear regression to compute model fit and model significance statistics. (SwR, Glossary)
Exponential Distribution: |
The exponential distribution is often concerned with the amount of time until some specific event occurs. For example, the amount of time (beginning now) until an earthquake occurs has an exponential distribution.
Values for an exponential random variable occur in the following way: There are fewer large values and more small values. For example, the amount of money customers spend in one trip to the supermarket follows an exponential distribution. There are more people who spend small amounts of money and fewer people who spend large amounts of money. ([LibreTexts Statistics](https://stats.libretexts.org/Bookshelves/Introductory_Statistics/Introductory_Statistics_(OpenStax)/05%3A_Continuous_Random_Variables/5.04%3A_The_Exponential_Distribution))
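The shape described above ("fewer large values and more small values") follows from the density itself; a minimal Python sketch (illustrative, with an arbitrary default rate):

```python
import math

# Exponential pdf f(x) = rate * exp(-rate * x) for x >= 0.
# The density decreases monotonically, so small values are more
# probable than large ones.
def exp_pdf(x, rate=1.0):
    return rate * math.exp(-rate * x)
```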
Exposure: |
Exposure is a characteristic, behavior, or other factor that may be associated with an outcome. (SwR, Glossary)
F-Distribution: |
F-distribution is the probability distribution underlying the [F-statistic], which is used to determine statistical significance for ANOVA and linear regression. (SwR, Glossary)
F-Statistic: |
F-statistic is a test statistic comparing explained and unexplained variance in [ANOVA] and linear regression. The F-statistic is a ratio where the variation between the groups is compared to the variation within the groups. (SwR, Glossary)
Factor: |
Two integers that multiply to obtain a number are considered factors of that number. ([Khan Academy](https://www.khanacademy.org/math/algebra/x2f8bb11595b61c86:quadratics-multiplying-factoring/x2f8bb11595b61c86:intro-factoring/a/intro-to-polynomial-factors-and-divisibility))
False Negative (Rate): |
A false negative is when a test returns negative while the truth is positive. A false negative rate is the probability of a false negative if the truth is positive. (Bayesian Thinking, Chap.1)
False Positive (Rate): |
A false positive is when a test returns positive while the truth is negative. A false positive rate is the probability of a false positive if the truth is negative. (Bayesian Thinking, Chap.1)
Familywise: |
Familywise error is the alpha or Type I error rate when conducting multiple statistical tests. A large familywise alpha is one of the reasons that analysis of variance is preferable to conducting multiple t-tests when comparing means across more than two groups. (SwR, Glossary)
Fisher’s exact test: |
Fisher’s exact test is an alternative to the chi-squared test for use with small samples. (SwR, Glossary)
Fligner: |
The Fligner-Killeen test is a non-parametric test for homogeneity of group variances based on ranks. It is useful when the data is non-normal or when there are outliers. (<a href="https://real-statistics.com/one-way-analysis-of-variance-anova/homogeneity-variances/fligner-killeen-test/">Real Statistics Using Excel</a>)
Floor: |
A floor effect happens when a variable has many observations that take the lowest value of the variable, which can indicate that the range of values was insufficient to capture the true variability of the data. (SwR, Glossary)
Frequencies: |
Frequencies are the number of times particular values of a variable occur. (SwR, Glossary)
Frequentist Statistics: |
Also known as frequentist inference, this is a type of statistical approach where conclusions are made based on the frequency of an event. This approach treats probability as the long-run frequency of an outcome when an experiment is repeated under the same set of conditions. In frequentist statistics the population parameters are fixed, but unknown, and the data observed in experiments are random. (<a href="https://deepai.org/machine-learning-glossary-and-terms/frequentist-statistics">deepai.org</a>)
Gender Nonconforming: |
Gender nonconforming means not adhering to society's gender norms. People may describe themselves as gender nonconforming if they don't conform to the gender expression, presentation, behaviors, roles, or expectations that society sees as the norm for their gender. ([APA Guidelines - PDF](https://www.apa.org/practice/guidelines/transgender.pdf))
Gaussian Distribution: |
A Gaussian distribution, also referred to as a normal distribution, is a type of continuous probability distribution that is symmetrical about its mean; most observations cluster around the mean, and the further away an observation is from the mean, the lower its probability of occurring. Like other probability distributions, the Gaussian distribution describes how the outcomes of a random variable are distributed. (<a href="https://www.math.net/gaussian-distribution">MATH.net</a>)
GAM: |
A Generalized Additive Model (GAM) is a generalized linear model in which the linear response variable depends linearly on unknown smooth functions of some predictor variables, and interest focuses on inference about these smooth functions. They can be interpreted as the discriminative generalization of the naive Bayes generative model. ([Wikipedia](https://en.wikipedia.org/wiki/Generalized_additive_model)).
GAMs relax the restriction that the relationship must be a simple weighted sum, and instead assume that the outcome can be modelled by a sum of arbitrary functions of each feature. ([Medium member story](https://towardsdatascience.com/generalised-additive-models-6dfbedf1350a#c407)) (Chap.4)
General Social Survey: |
A large survey of a sample of people in the United States conducted regularly since 1972; the General Social Survey is abbreviated GSS and is conducted by the National Opinion Research Center at the University of Chicago. (Harris, Glossary)
GLM: |
A generalized linear model (GLM) is a flexible generalization of ordinary linear regression. The GLM generalizes linear regression by allowing the linear model to be related to the response variable via a link function and by allowing the magnitude of the variance of each measurement to be a function of its predicted value. (<a href="https://en.wikipedia.org/w/index.php?title=Generalized_linear_model&oldid=1175448680">Wikipedia</a>) The binary logistic regression model is one example of a generalized linear model. (SwR, Glossary)
Goodness-of-fit: |
The chi-squared goodness-of-fit test is used for comparing the values of a single categorical variable to values from a hypothesized or population variable. The goodness-of-fit test is often used when trying to determine if a sample is a good representation of the population. (SwR, Chap 5)
Golem: |
A golem (goh-lem) is a clay robot from Jewish folklore. It is used in SR2 as a metaphor for a statistical model. (SR2, Chap.1)
Grand: |
Grand mean is the overall mean of a continuous variable that is used to determine distances from the mean for individuals and groups in ANOVA. (SwR, Glossary)
Grid Approximation: |
One of the simplest conditioning techniques is grid approximation. While most parameters are continuous, capable of taking on an infinite number of values, it turns out that we can achieve an excellent approximation of the continuous posterior distribution by considering only a finite grid of parameter values. (SR2, Chap.2)
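A minimal Python sketch of grid approximation, assuming the globe-tossing example from SR2 (6 successes in 9 trials, flat prior):

```python
from math import comb

# Grid of candidate parameter values between 0 and 1
n_grid = 101
grid = [i / (n_grid - 1) for i in range(n_grid)]
prior = [1.0] * n_grid                                       # flat prior
likelihood = [comb(9, 6) * p**6 * (1 - p)**3 for p in grid]  # binomial
unstd = [li * pr for li, pr in zip(likelihood, prior)]
posterior = [u / sum(unstd) for u in unstd]                  # normalize to sum 1
# the posterior peaks near the observed proportion 6/9
```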
Grouped Bar Chart: |
A grouped bar chart is a data visualization that shows two categorical variables in a bar chart where one group is shown along the x-axis for vertical bars or y-axis for horizontal bars and the other grouping is shown as separate bars within each of the first grouping variable categories; the bars are often different colors to distinguish the groups. (SwR, Glossary)
GSS: |
A large survey of a sample of people in the United States conducted regularly since 1972; the General Social Survey is abbreviated GSS and is conducted by the National Opinion Research Center at the University of Chicago. (Harris, Glossary)
GUI: |
A graphical user interface (GUI) is an operating system interface that enables user interactions with an electronic device through icons, images, and other graphical elements ([G2](https://www.g2.com/articles/graphical-user-interface)).
GVIF: |
The generalized variance inflation factor (GVIF) is a generalized version of the variance inflation factor (VIF) that is used to identify problems with multicollinearity; the GVIF is used in binary logistic regression. (SwR, Glossary)
Hausman-McFadden: |
The Hausman-McFadden test is a consistency test for multinomial logit models. If the independence of irrelevant alternatives applies, the probability ratio of every two alternatives depends only on the characteristics of these alternatives. (help page for hmftest() in the {mlogit} package)
HDCI: |
Highest-density *continuous* interval, also known as the shortest probability interval. The same as HPDI. (tidybayes/<a href="https://mjskay.github.io/ggdist/reference/point_interval.html">ggdist</a>)
Heteroscedasticity: |
Heteroscedasticity is a systematic change in the spread of the residuals over the range of measured values. Heteroscedasticity is a problem because ordinary least squares (OLS) regression assumes that all residuals are drawn from a population that has a constant variance ([homoscedasticity]). (<a href="https://statisticsbyjim.com/regression/heteroscedasticity-regression/">Statistics by Jim</a>)
Histograms: |
Histograms are visual displays of data used to examine the distribution of a numeric variable. (SwR, Glossary)
HMC: |
Hamiltonian Monte Carlo (HMC) is a Markov chain Monte Carlo (MCMC) method that uses the derivatives of the density function being sampled to generate efficient transitions spanning the posterior. It uses an approximate Hamiltonian dynamics simulation based on numerical integration which is then corrected by performing a Metropolis acceptance step. ([Stan Reference Manual](https://mc-stan.org/docs/reference-manual/hamiltonian-monte-carlo.html))
Homogeneity of Variances: |
Homogeneity of variances is equal variances among groups; homogeneity of variance is one of the assumptions tested for independent and dependent t-tests and analysis of variance. (SwR, Glossary)
Homoscedasticity: |
Homoscedasticity is [homogeneity of variances]; its opposite is [Heteroscedasticity]. Homoscedasticity is an assumption of correlation and linear regression that requires that the variance of y be constant across all the values of x; visually, this assumption would show points along a fit line between x and y being evenly spread on either side of the line for the full range of the relationship. (SwR, Glossary)
HPDI: |
Highest Posterior Density Interval (HPDI) is the Highest Density Interval (HDI) or Highest Density Region (HDR) of the posterior distribution: of all possible regions with a given probability coverage, the HDR is the smallest region possible in the sample space. For a unimodal distribution it will include the mode (the maximum a posteriori, or MAP). (<a href="https://stats.stackexchange.com/a/464155/207389">Cross Validated</a>).
HSD: |
Tukey’s Honestly Significant Difference (HSD) is a post hoc test to determine which means are statistically significantly different from each other following a significant ANOVA result; Tukey’s HSD compares each pair of means and so is considered a pairwise test, but it is less conservative than the Bonferroni post hoc test. (SwR, Glossary)
HTML: |
HyperText Markup Language or HTML is the standard markup language for documents designed to be displayed in a web browser. It defines the content and structure of web content. It is often assisted by technologies such as Cascading Style Sheets (CSS) and scripting languages such as JavaScript. Web browsers receive HTML documents from a web server or from local storage and render the documents into multimedia web pages. HTML describes the structure of a web page semantically and originally included cues for its appearance. (<a href="https://en.wikipedia.org/wiki/HTML">Wikipedia</a>)
HTTP: |
The Hypertext Transfer Protocol (HTTP) is an application layer protocol in the Internet protocol suite model for distributed, collaborative, hypermedia information systems. HTTP is the foundation of data communication for the World Wide Web, where hypertext documents include hyperlinks to other resources that the user can easily access, for example by a mouse click or by tapping the screen in a web browser. (<a href="https://en.wikipedia.org/wiki/HTTP">Wikipedia</a>)
Hypothesis: |
A hypothesis is a model about how the world works that makes a prediction. All of our basic beliefs about the world are hypotheses. (BF, Chap.1)
IIA: |
Independence of irrelevant alternatives (IIA) is an assumption of multinomial logistic regression that requires the categories of the outcome to be independent of one another; for a three-category outcome variable with Categories A, B, and C, the probability of being in Category A compared to Category B cannot change because Category C exists. (SwR, Glossary)
ICU: |
International Components for Unicode (ICU) is a mature, widely used set of C/C++ and Java libraries providing Unicode and Globalization support for software applications. ICU is widely portable and gives applications the same results on all platforms and between C/C++ and Java software. ICU is released under a nonrestrictive open source license that is suitable for use with both commercial software and with other open source or free software. (<a href="https://icu.unicode.org/home">ICU</a>)
iframe: |
iframes allow you to embed HTML documents inside another HTML document. They offer a seamless way to integrate content from one source into another, enabling developers to create more dynamic and interactive web pages. ([dev.to](https://dev.to/joanayebola/what-is-iframe-and-how-to-use-them-1c63))
i.i.d.: |
Independent and identically distributed. Random variables are independent and identically distributed if each random variable has the same probability distribution as the others and all are mutually independent. This property is usually abbreviated as i.i.d., iid, or IID. (<a href="https://en.wikipedia.org/wiki/Independent_and_identically_distributed_random_variables">Wikipedia</a>) Example: Successive throws of a fair coin are independent, as the coin has no memory. And every throw is 50:50 (heads:tails), so the coin is and stays fair; the distribution from which every throw is drawn, so to speak, is and stays the same: "identically distributed". (<a href="https://stats.stackexchange.com/a/17392/207389">Cross Validated</a>)
Independent: |
Independent-samples t-test or unpaired sample t-test is an inferential test comparing two independent means. (SwR, Glossary)
Index of Qualitative Variation: |
The index of qualitative variation (IQV) is a type of statistic used to measure variation for nominal variables, which is often computed by examining how spread out observations are among the groups. (SwR, Glossary)
Inferential Statistics: |
Statistical inference is the process of using data analysis to infer properties of an underlying distribution of probability. It makes propositions about a population, using data drawn from the population with some form of sampling. Given a hypothesis about a population, for which we wish to draw inferences, statistical inference consists of (first) selecting a statistical model of the process that generates the data and (second) deducing propositions from the model. Inferential statistics can be contrasted with descriptive statistics. (<a href="https://en.wikipedia.org/wiki/Statistical_inference">Wikipedia</a>)
Influential Observation: |
An influential observation is an observation that changes the slope of a regression line. (SwR, Glossary) Influential points are observed data points that are far from the other observed data points in the horizontal direction. These points may have a big effect on the slope of the regression line. To begin to identify an influential point, you can remove it from the data set and see if the slope of the regression line is changed significantly. (<a href= "https://openstax.org/books/introductory-statistics/pages/12-6-outliers">Introductory Statistics 12.6</a>)
Interaction: |
An interaction means the importance of one variable depends upon another. For example, plants benefit from both light and water. But in the absence of either, the other provides no benefit at all. If variables interact, then effective inference about one variable will depend upon consideration of the others. (Chap.5)
Intercept: |
The intercept is the value of the dependent variable when all independent variables have the value zero. (<a href="https://link.springer.com/referenceworkentry/10.1007/978-94-007-0753-5_1486">Intercept, Slope in Regression</a>)
IQR: |
The interquartile range (IQR) is the upper and lower boundaries around the middle 50\% of the data in a numeric variable or the difference between the upper and lower boundaries around the middle 50\% of the data in a numeric variable. (SwR, Glossary)
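A minimal Python sketch of the computation (an illustration assuming the standard library's default 'exclusive' quartile convention; other conventions give slightly different cut points):

```python
import statistics

# IQR = third quartile minus first quartile
def iqr(values):
    q1, _, q3 = statistics.quantiles(values, n=4)  # default method='exclusive'
    return q3 - q1
```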
IQV: |
The index of qualitative variation (IQV) is a type of statistic used to measure variation for nominal variables, which is often computed by examining how spread out observations are among the groups. (SwR, Glossary)
JAGS: |
JAGS is Just Another Gibbs Sampler. It is a program for analysis of Bayesian hierarchical models using Markov Chain Monte Carlo (MCMC) simulation. (<a href="https://mcmc-jags.sourceforge.io/">JAGS</a>)
Joint Probability: |
Given two random variables that are defined on the same probability space, the joint probability distribution is the corresponding probability distribution on all possible pairs of outputs. The joint distribution can just as well be considered for any given number of random variables. (<a href="https://en.wikipedia.org/w/index.php?title=Joint_probability_distribution&oldid=1159896069">Wikipedia</a>)
JPG: |
JPEG (/ˈdʒeɪpɛɡ/ JAY-peg), short for Joint Photographic Experts Group, is a commonly used method of lossy compression for digital images. ([Wikipedia](https://en.wikipedia.org/wiki/JPEG))
Kernel Density Estimation: |
Kernel density estimation (KDE) extrapolates data to an estimated population probability density function. It’s called kernel density estimation because each data point is replaced with a kernel—a weighting function to estimate the pdf. The function spreads the influence of any point around a narrow region surrounding the point. (<a href="https://www.statisticshowto.com/kernel-density-estimation/">Statistics How To</a>)
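A minimal Python sketch with a Gaussian kernel (the bandwidth h and the kernel choice are illustrative assumptions):

```python
import math

# KDE: each data point contributes a kernel "bump"; the estimate at x is
# the average of the bumps, scaled by the bandwidth h.
def kde(x, sample, h=1.0):
    def gauss(u):
        return math.exp(-u * u / 2) / math.sqrt(2 * math.pi)
    return sum(gauss((x - xi) / h) for xi in sample) / (len(sample) * h)
```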
KFF: |
Kaiser Family Foundation (KFF) is a non-partisan organization focused on health policy. It conducts its own research, polling, journalism, and specialized public health information campaigns and its website has been heralded for having the "most up-to-date and accurate information on health policy" and as a "must-read for healthcare devotees." (<a href="https://en.wikipedia.org/wiki/Kaiser_Family_Foundation">Wikipedia</a>)
Knots: |
B-splines divide the full range of some predictor variable into parts separated by knots. Knots are cut points that define different regions (or partitions) of a variable's range. In each region, a separate fit occurs; the definition of different regions is a way to stay local in the fitting process. ([DataCademia | Statistics Knots (Cut Points)](https://datacadamia.com/data_mining/knot)) (Chap.4)
Kolmogorov-Smirnov: |
The Kolmogorov-Smirnov test is used when the assumption of equal variances (homogeneity of variances) fails for the independent-samples t-test; the test compares the distributions of the groups rather than their means. (SwR, Glossary)
Kruskal-Wallis: |
Kruskal-Wallis test is used to compare ranks across three or more groups when the normal distribution assumption fails for analysis of variance (ANOVA) (SwR, Glossary)
Kurtosis: |
Kurtosis is a measure of how many observations are in the tails of a distribution; distributions can look bell-shaped but have a lot of observations in the tails (leptokurtic) or very few observations in the tails (platykurtic). (SwR, Glossary)
Labelled Data: |
Labelled data (or labelled vectors) is a common data structure in other statistical environments to store meta-information about variables, like variable names, value labels or multiple defined missing values. (<a href="https://strengejacke.github.io/sjlabelled/articles/intro_sjlabelled.html">sjlabelled</a>)
Leptokurtic: |
Leptokurtic is a distribution of a numeric variable that has many values clustered around the middle of the distribution; leptokurtic distributions often appear tall and pointy compared to mesokurtic or platykurtic distributions. (SwR, Glossary)
Levene: |
Levene’s test is a statistical test to determine whether observed data meet the homogeneity of variances assumption; Levene’s test is used to test this assumption for t-tests and analysis of variance. (SwR, Glossary)
Leverage: |
You can think of the regression line being balanced at the x-mean and the further from that location a point is, the more a single point can move the line. We can measure the distance of points from the mean to quantify each observation’s potential for impact on the line using what is called the leverage of a point. Leverage is a positive numerical measure with larger values corresponding to more leverage. The scale changes depending on the sample size (n) and the complexity of the model so all that matters is which observations have more or less relative leverage in a particular data set. (<a href="https://stats.libretexts.org/Bookshelves/Advanced_Statistics/Intermediate_Statistics_with_R_(Greenwood)/06%3A_Correlation_and_Simple_Linear_Regression/6.09%3A_Outliers_-_leverage_and_influence">Outliers - leverage and influence</a>)
Linearity: |
Linearity is the assumption of some statistical models that requires the outcome, or transformed outcome, to have a linear relationship with numeric predictors, where linear relationships are relationships that are evenly distributed around a line. (SwR, Glossary)
Linear Model: |
A linear model specifies a linear relationship between a dependent variable and n independent variables. It conforms to a mathematical model represented by a linear equation of the form Y = b_{1}X_{1} + b_{2}X_{2} + … + b_{n}X_{n}. ([Oxford Reference](https://www.oxfordreference.com/display/10.1093/oi/authority.20110803100107198))
Linear Regression: |
Linear regression is used to predict the value of an outcome variable Y based on one or more input predictor variables X. The aim is to establish a linear relationship (a mathematical formula) between the predictor variable(s) and the response variable, so that, we can use this formula to estimate the value of the response Y, when only the predictors (Xs) values are known. ([r-statistics.co](https://r-statistics.co/Linear-Regression.html)) (Chap.4)
Linear Transformation: |
Linear transformation are transformations that keep existing linear relationships between variables, often by multiplying or dividing one or both of the variables by some amount. (SwR, Glossary)
Line Graph: |
A line graph is a visual display of data often used to examine the relationship between two continuous variables or for something measured over time. (SwR, Glossary)
Likelihood: |
The likelihood function (often simply called the likelihood) is the joint probability (or probability density) of observed data viewed as a function of the parameters of a statistical model. (<a href="https://en.wikipedia.org/wiki/Likelihood_function">Wikipedia</a>) It indicates how likely a particular population is to produce an observed sample. (<a href="https://www.statistics.com/glossary/likelihood-function/">statistics.com</a>) It is the probability of the data given our beliefs about the data: P(data | belief). (BF, Chap.8)
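A worked Python sketch with a hypothetical binomial example (6 heads in 9 coin flips, viewed as a function of the heads-probability p):

```python
from math import comb

# Binomial likelihood of the fixed data (6 heads in 9 flips) as a
# function of the parameter p
def likelihood(p, heads=6, flips=9):
    return comb(flips, heads) * p**heads * (1 - p)**(flips - heads)

# the likelihood is maximized at the observed proportion heads/flips
```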
Likelihood-ratio: |
The likelihood ratio test is a test that compares two nested binary logistic regression models to determine which is a better fit to the data; the difference between two log-likelihoods follows a chi-squared distribution with a significant result indicating the larger model is a better fitting model. (SwR, Glossary)
LIS: |
The 'Luxembourg Income Study Database' (LIS) is a data archive and research center dedicated to cross-national analysis. It is home to two databases: The Luxembourg Income Study Database (LIS) and Luxembourg Wealth Study Database (LWS). [Lis Data Center](https://www.lisdatacenter.org/about-lis/)
Literate Programming: |
Literate programming is a methodology that combines a programming language with a documentation language, thereby making programs more robust, more portable, more easily maintained, and arguably more fun to write than programs that are written only in a high-level language. The main idea is to treat a program as a piece of literature, addressed to human beings rather than to a computer.(<a href="https://www-cs-faculty.stanford.edu/~knuth/lp.html">Donald Knuth</a>)
Loess: |
Loess curve is a graph curve that shows the relationship between two variables without constraining the line to be straight; it can be compared to a linear fit line to determine whether the relationship is close to linear or not (= checking the [linearity] assumption). The procedure originated as LOWESS (LOcally WEighted Scatter-plot Smoother). It is a nonparametric method because the linearity assumptions of conventional regression methods have been relaxed, and it is called local regression because the fitting at, say, point x is weighted toward the data nearest to x. (SwR, Glossary and <a href="https://www.statsdirect.com/help/Default.htm#nonparametric_methods/loess.htm">LOESS Curve Fitting (Local Polynomial Regression)</a>)
Logit Transformations: |
Logit transformations are transformations that takes the log value of p/(1-p); this transformation is often used to normalize percentage data and is used in the logistic model to transform the outcome. (SwR, Glossary)
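A minimal Python sketch of the transformation and its inverse (illustrative):

```python
import math

# logit: log odds of p, mapping (0, 1) onto the whole real line
def logit(p):
    return math.log(p / (1 - p))

# the logistic (inverse logit) function maps back to (0, 1)
def inv_logit(x):
    return 1 / (1 + math.exp(-x))
```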
Loss Function: |
A loss function is a rule that tells you the cost associated with using any particular point estimate. … The key insight is that *different loss functions imply different point estimates*. (SR2, p.59)
Mahalanobis: |
Mahalanobis distance is a measure of how far a point is from the mean of a multivariate distribution, normalized by the covariance matrix of the distribution. It is calculated as the square root of the product of the difference vector, the inverse covariance matrix, and the transpose of the difference vector. (Google)
Mann-Whitney: |
Mann-Whitney U test, also called Wilcoxon rank sum test, is an alternative for comparing a numeric or ordinal variable across two groups when the independent-samples t-test assumption of normality is not met. (SwR, Glossary)
Main Effect: |
Main effect is the relationship between only one of the independent variables and the dependent variable, ignoring the impact of any additional independent variables or interaction effects. (SwR, Glossary)
MAP: |
In Bayesian statistics a Maximum A Posteriori probability or MAP is essentially the mode of the [posterior distribution]. (CDS, p.272)
Marginal Distribution: |
It is the probability distribution of each of the individual variables. A marginal distribution gets its name because it appears in the margins of a probability distribution table. ([Statology](https://www.statology.org/marginal-distribution/), [Statistics How To](https://www.statisticshowto.com/probability-and-statistics/statistics-definitions/marginal-distribution/)) (Chap.4)
Markov Chain: |
A Markov chain or Markov process is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. Informally, this may be thought of as, "What happens next depends only on the state of affairs now." ([Wikipedia](https://en.wikipedia.org/wiki/Markov_chain)) For example, if you made a Markov chain model of a baby's behavior, you might include "playing," "eating," "sleeping," and "crying" as states, which together with other behaviors could form a 'state space': a list of all possible states. In addition, on top of the state space, a Markov chain tells you the probability of hopping, or "transitioning," from one state to any other state, e.g., the chance that a baby currently playing will fall asleep in the next five minutes without crying first. ([Explained visually](https://setosa.io/ev/markov-chains/))
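The baby example above can be sketched in Python (the transition probabilities are made-up numbers for illustration):

```python
import random

# Each row gives the probabilities of moving from one state to the next;
# every row sums to 1.
transition = {
    "playing":  {"playing": 0.5, "eating": 0.2, "sleeping": 0.2, "crying": 0.1},
    "eating":   {"playing": 0.3, "eating": 0.3, "sleeping": 0.3, "crying": 0.1},
    "sleeping": {"playing": 0.4, "eating": 0.3, "sleeping": 0.2, "crying": 0.1},
    "crying":   {"playing": 0.2, "eating": 0.3, "sleeping": 0.3, "crying": 0.2},
}

# What happens next depends only on the current state
def next_state(current, rng=random):
    probs = transition[current]
    return rng.choices(list(probs), weights=list(probs.values()))[0]
```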
Mean Square: |
Mean square is the mean of the squared differences between two values; mean squares are used to compute the F-statistic in analysis of variance and linear regression. (SwR, Glossary)
MER: |
The Market Exchange Rate (MER) is the rate at which one currency can be exchanged for another.
MCMC: |
Markov chain Monte Carlo (MCMC) methods comprise a class of algorithms for sampling from a probability distribution. By constructing a [markov chain] that has the desired distribution as its equilibrium distribution, one can obtain a sample of the desired distribution by recording states from the chain. The more steps that are included, the more closely the distribution of the sample matches the actual desired distribution. Various algorithms exist for constructing chains. (<a href="https://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo">Wikipedia</a>)
McNemar’s test: |
McNemar’s test is an alternative to the chi-squared test of independence for use when observations are not independent and both variables are binary; for example, McNemar’s test could be used to compare proportions in two groups before and after an intervention. (SwR, Glossary)
MDN: |
MDN Web Docs, previously Mozilla Developer Network and formerly Mozilla Developer Center, is a documentation repository and learning resource for web developers. MDN Web Docs content is maintained by Mozilla, Google employees, and volunteers (community of developers and technical writers). It also contains content contributed by Microsoft, Google, and Samsung. Topics include HTML5, JavaScript, CSS, Web APIs, Django, Node.js, WebExtensions, MathML, and others. (<a href="https://en.wikipedia.org/wiki/MDN_Web_Docs">Wikipedia</a>)
Mean Absolute Deviation: |
The mean absolute deviation (MAD), also known as the mean deviation or average absolute deviation, is a measure of variability that indicates the average distance between observations and their mean. MAD uses the original units of the data, which simplifies interpretation. Larger values signify that the data points spread out further from the average. Conversely, lower values correspond to data points bunching closer to it. (<a href="https://statisticsbyjim.com/basics/mean-absolute-deviation/">Statistics by Jim</a>). Note that the R base function `mad()` computes the median absolute deviation, not the mean absolute deviation.
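A small Python sketch on toy data, computing both the mean absolute deviation and the median absolute deviation (the latter is what R's `mad()` reports, up to a constant scale factor); the outlier shows how much more robust the median version is:

```python
from statistics import mean, median

data = [1, 2, 3, 4, 100]  # made-up data with one extreme value

# Mean absolute deviation: average distance of observations from their mean.
m = mean(data)
mean_abs_dev = mean(abs(x - m) for x in data)

# Median absolute deviation: median distance from the median; the
# outlier barely affects it.
med = median(data)
median_abs_dev = median(abs(x - med) for x in data)
```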
Median: |
Median is the middle value, or the mean of the two middle values, for a variable. (SwR, Glossary)
Mesokurtic: |
Mesokurtic distributions are distributions that are neither platykurtic nor leptokurtic; a normal distribution is a common example of a mesokurtic distribution. (SwR, Glossary)
Metropolis Algorithm: |
The Metropolis algorithm, often generalized as the Metropolis–Hastings algorithm, is a Markov chain Monte Carlo (MCMC) method for obtaining a sequence of random samples from a probability distribution from which direct sampling is difficult. This sequence can be used to approximate the distribution (e.g. to generate a histogram) or to compute an integral (e.g. an expected value). (<a href="https://en.wikipedia.org/w/index.php?title=Metropolis%E2%80%93Hastings_algorithm&oldid=1172902257">Wikipedia</a>)
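A minimal Python sketch of the Metropolis algorithm, targeting a standard normal density known only up to a constant (the step size and chain length here are arbitrary choices):

```python
import math
import random

def log_target(x):
    # Log of an unnormalized standard normal density.
    return -0.5 * x * x

def metropolis(log_target, x0, n_steps, step_size=1.0):
    """Metropolis sampler with a symmetric (Gaussian) random-walk proposal:
    a proposal is accepted with probability min(1, target(x')/target(x))."""
    x = x0
    samples = []
    for _ in range(n_steps):
        proposal = x + random.gauss(0.0, step_size)
        if math.log(random.random()) < log_target(proposal) - log_target(x):
            x = proposal              # accept the move
        samples.append(x)             # on rejection, the old state repeats
    return samples

random.seed(42)
draws = metropolis(log_target, x0=0.0, n_steps=20000)
mean_est = sum(draws) / len(draws)
var_est = sum(d * d for d in draws) / len(draws)
```

With enough steps, the sample mean and variance approach the target's 0 and 1.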
Metropolis–Hastings Algorithm: |
See: Metropolis Algorithm
Microdata: |
Microdata are unit-level data obtained from sample surveys, censuses, and administrative systems. They provide information about characteristics of individual people or entities such as households, business enterprises, facilities, farms or even geographical areas such as villages or towns. They allow in-depth understanding of socio-economic issues by studying relationships and interactions among phenomena. Microdata are thus key to designing projects and formulating policies, targeting interventions and monitoring and measuring the impact and results of projects, interventions and policies. ([The World Bank](https://datahelpdesk.worldbank.org/knowledgebase/articles/228873-what-do-we-mean-by-microdata))
MLE: |
Maximum likelihood estimation (MLE) is a method of estimating the parameters of an assumed probability distribution, given some observed data. This is achieved by maximizing a likelihood function so that, under the assumed statistical model, the observed data is most probable. The point in the parameter space that maximizes the likelihood function is called the maximum likelihood estimate. (<a href="https://en.wikipedia.org/w/index.php?title=Maximum_likelihood_estimation&oldid=1178502960">Wikipedia</a>)
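A sketch of the idea in Python with made-up coin-flip data and a simple grid search; for Bernoulli data the MLE also has a closed form, the sample proportion:

```python
import math

# Observed coin flips (1 = heads); invented data for illustration.
flips = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]

def log_likelihood(p, data):
    # Log-likelihood of Bernoulli(p) for the observed data.
    return sum(math.log(p) if x == 1 else math.log(1 - p) for x in data)

# Search a grid of candidate values for p (avoiding 0 and 1, where the
# log-likelihood is undefined).
grid = [i / 1000 for i in range(1, 1000)]
mle = max(grid, key=lambda p: log_likelihood(p, flips))

# Closed form for Bernoulli data: the sample proportion of heads.
sample_proportion = sum(flips) / len(flips)
```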
Mode: |
Mode is the most common value of a variable. (SwR, Glossary)
Model-fit: |
Model fit refers to how well the model captures the relationship in the observed data. (SwR, Glossary)
Monomial: |
A monomial is an algebraic expression that has just one term. For example, \[3x\] and \[-5y^2\] are both monomials. ([Khan Academy](https://www.khanacademy.org/math/algebra/x2f8bb11595b61c86:quadratics-multiplying-factoring/x2f8bb11595b61c86:factor-quadratics-strategy/a/quadratics-multiplying-factoring-faq))
Monotonic: |
Monotonic is a statistical relationship that, when visualized, goes up or down, but not both. (SwR, Glossary)
Mosaic Plots: |
Mosaic plots are visual representations of data to show the relationship between two categorical variables; useful primarily when both variables have few categories. (SwR, Glossary)
Multiple Regression: |
Multiple Regression uses more than one predictor variable to simultaneously model an outcome. (Chap.5)
National Accounts: |
National accounts are a system of accounts and balance sheets that provide a broad and integrated framework to describe an economy, whether a region, a country, or a group of countries such as the European Union (EU). For internationally comparable national accounts this system needs to be based on common concepts, definitions, classifications and accounting rules, in order to arrive at a consistent, reliable and comparable quantitative description of an economy. National accounts provide systematic and detailed economic data useful for economic analysis to support the development and monitoring of policy-making. ([eurostat: Statistics Explained](https://ec.europa.eu/eurostat/statistics-explained/index.php?title=National_accounts_-_an_overview)).
National Income: |
National income is the sum of all incomes received by individual residents of a given country over a year. Income takes various forms, and we typically distinguish two broad sources: income stemming from individuals’ labor (e.g. wages or salaries) and income stemming from individuals’ wealth (e.g. interest and dividends). (WIR2022, p.20)
National Wealth: |
National wealth is the sum of the value of all assets owned by individuals in a given country. It is a stock resulting from capital accumulation (from savings, i.e. income that has not been consumed) and price effects. (WIR2022, p.20)
Negation: |
The probability of X and the probability of the negation of X sum to 1 (in other words, values are either X or not X). The ¬ symbol means “negation” or “not.” (BF, Chap.2)
NegativeCorr: |
Negative correlation is a statistical relationship where two things move in opposite directions; as one goes up, the other goes down, and vice versa. (SwR, Glossary)
Negatively Skewed: |
A distribution is negatively skewed when it has extreme values in the left-hand-side tail, toward the negative numbers on the number line; negative skew is also known as left skew. (SwR, Glossary)
Net National Income: |
The net national income equals the gross domestic product (GDP) minus capital depreciation, plus net foreign income. In addition to the income of private households, the national income also includes the income of the other domestic sectors (i.e., the nonfinancial corporations, the financial corporations, the general government, and the nonprofit institutions serving households)([Distributional National Accounts (DINA) for Austria, 2004-2016](https://wid.world/document/distributional-national-accounts-dina-for-austria-2004-2016-world-inequality-lab-wp-2020-23/)).
NIMBLE: |
NIMBLE is a system for building and sharing analysis methods for statistical models, especially for hierarchical models and computationally-intensive methods. Other packages that use the BUGS language are only for Markov chain Monte Carlo (MCMC). With NIMBLE, you can turn BUGS code into model objects and use them for whatever algorithm you want. (<a href="https://r-nimble.org/">r-nimble.org</a>)
NHANES: |
The National Health and Nutrition Examination Survey (NHANES) is a program of studies designed to assess the health and nutritional status of adults and children in the United States. The survey is unique in that it combines interviews and physical examinations. (<a href="https://www.cdc.gov/nchs/nhanes/about_nhanes.htm">NHANES</a>)
NHST: |
Null Hypothesis Significance Testing (NHST) is a process for organizing inferential statistical tests. (SwR, Glossary)
Non-informative Prior: |
A prior distribution which is specified in an attempt to be noncommittal about a parameter, for example, a uniform distribution. (The Cambridge Dictionary of Statistics, CDS, p.303)
Nonlinear Transformations: |
Nonlinear transformations are transformations that increase (or decrease) the linear relationship between two variables by applying an exponent (i.e., a [power transformation]) or other function to one or both of the variables. (SwR, Glossary)
Null Hypothesis: |
The null hypothesis (H0, or simply the Null) is a statement of no difference or no association that is used to guide statistical inference testing. (SwR, Glossary)
NUTS: |
The abbreviation "NUTS" stands for **No U-Turn Sampler**; it is a Hamiltonian Monte Carlo (HMC) method. Although still a Markov chain Monte Carlo method, it avoids the random-walk behavior of simpler samplers, which is often deemed inefficient and slow to converge. Instead of doing a random walk, NUTS makes jumps of some length x, doubling the trajectory length as the algorithm continues to run, until the trajectory reaches a point where it wants to return to the starting point. ([CrossValidated](https://stats.stackexchange.com/questions/311813/can-somebody-explain-to-me-nuts-in-english)) (Chap.4 in my notes)
Odds Ratio: |
Odds is usually defined in statistics as the probability an event will occur divided by the probability that it will not occur. An odds ratio (OR) is a measure of association between a certain property A and a second property B in a population. Specifically, it tells you how the presence or absence of property A has an effect on the presence or absence of property B. (<a href="https://www.statisticshowto.com/probability-and-statistics/probability-main-index/odds-ratio/">Statistics How To</a>). An odds ratio is a ratio of two ratios. They quantify the strength of the relationship between two conditions. They indicate how likely an outcome is to occur in one context relative to another. (<a href="https://statisticsbyjim.com/probability/odds-ratio/">Statistics by Jim</a>)
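A hypothetical 2×2 example in Python (the counts are invented for illustration):

```python
# Hypothetical 2x2 table: exposure (rows) by outcome (columns).
#                outcome   no outcome
# exposed            a=20        b=80
# unexposed          c=10        d=90
a, b, c, d = 20, 80, 10, 90

# Odds of the outcome within each group: P(occurs) / P(does not occur),
# which with counts reduces to a/b and c/d.
odds_exposed = a / b
odds_unexposed = c / d

# The odds ratio is a ratio of these two ratios; equivalently (a*d)/(b*c).
odds_ratio = odds_exposed / odds_unexposed
```

Here the odds of the outcome are 2.25 times higher in the exposed group.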
OLS: |
Ordinary least squares (OLS) regression is a method of estimating a linear regression model that finds the regression line by minimizing the squared differences between each data point and the regression line. (SwR, Glossary)
Omega-squared: |
Omega-squared is an effect size for determining the strength of a relationship following an analysis of variance ([ANOVA]) statistical test. In contrast to [eta-squared] it is adjusted to account for the positive bias, and is more stable when assumptions are not completely met. (SwR, Glossary)
Omnibus: |
An omnibus is a statistical test that identifies that there is some relationship going on between variables, but not what that relationship is. (SwR, Glossary)
One-to-one Function: |
A one-to-one function is a function in which each output value corresponds to exactly one input value. (Precalculus, p.21)
One-sample: |
One-sample t-test, also known as the single-parameter t-test or single-sample t-test, is an inferential statistical test comparing the mean of a numeric variable to a population or hypothesized mean. (SwR, Glossary)
Outcome: |
Outcome is the variable being explained or predicted by a model; in linear and logistic regression, the outcome variable is on the left-hand side of the equal sign. (SwR, Glossary)
Outliers: |
Outliers are observations with unusual values. (SwR, Glossary). Outliers are observed data points that are far from the least squares line. They have large "errors", where the "error" or residual is the vertical distance from the line to the point. (<a href= "https://openstax.org/books/introductory-statistics/pages/12-6-outliers">Introductory Statistics 12.6</a>)
OWID: |
'Our World In Data' (OWID) is an online publication that focuses on large global problems such as poverty, disease, hunger, climate change, war, existential risks, and inequality. [Wikipedia](https://en.wikipedia.org/wiki/Our_World_in_Data)
p-hacking: |
P-hacking is a set of statistical decisions and methodology choices during research that artificially produces statistically significant results. These decisions increase the probability of false positives—where the study indicates an effect exists when it actually does not. P-hacking is also known as data dredging, data fishing, and data snooping. (<a href="https://statisticsbyjim.com/hypothesis-testing/p-hacking/">Statistics by Jim</a>)
p-value: |
The p-value is the probability, computed assuming the null hypothesis is true, of obtaining a test statistic at least as big as the one observed. (SwR, Glossary)
Paired: |
A dependent-samples t-test or paired-samples t-test is an inferential test comparing two related means. (SwR, Glossary)
Pairwise Comparisons: |
Pairwise comparisons are comparisons between every pair of groups to identify which are statistically significantly different from one another following a statistically significant result in an analysis of variance (ANOVA) or other multigroup analysis. (SwR, Glossary)
Parameter: |
Unobserved variables are usually called Parameters. (SR2, Chap.2) A parameter is an unknown numerical characteristic of a population that must be estimated. (CDS). They are also numbers that govern statistical models ([stats.stackexchange](https://stats.stackexchange.com/a/255994/207389)). A parameter is also a number that is a defining characteristic of some population or a feature of a population. (SwR, Glossary)
PartialCorr: |
Partial correlation is a standardized measure of the amount of variance two variables share after accounting for variance they both share with a third variable. (SwR, Glossary)
Partial-F: |
Partial-F test is a statistical test to see if two nested linear regression models are statistically significantly different from each other; this test is usually used to determine if a larger model accounts for enough additional variance to justify the complexity in interpretation that comes with including more variables in a model. (SwR, Glossary)
PDF: |
A probability density function (PDF) describes a probability distribution for a random, continuous variable. Use a probability density function to find the chances that the value of a random variable will occur within a range of values that you specify. More specifically, a PDF is a function where its integral for an interval provides the probability of a value occurring in that interval. (<a href="https://statisticsbyjim.com/probability/probability-density-function/">Statistics By Jim</a>)
PDMP: |
In the United States, prescription monitoring programs (PMPs) or prescription drug monitoring programs (PDMPs) are state-run programs which collect and distribute data about the prescription and dispensation of federally controlled substances and, depending on state requirements, other potentially abusable prescription drugs. PMPs are meant to help prevent adverse drug-related events such as opioid overdoses, drug diversion, and substance abuse by decreasing the amount and/or frequency of opioid prescribing, and by identifying those patients who are obtaining prescriptions from multiple providers (i.e., "doctor shopping") or those physicians overprescribing opioids. (<a href="https://en.wikipedia.org/wiki/Prescription_monitoring_program">Wikipedia</a>)
Pearson: |
Pearson’s r is a statistic that indicates the strength and direction of the relationship between two numeric variables that meet certain assumptions. (SwR, Glossary)
Percentages: |
Percentages are relative values indicating hundredth parts of any quantity. (<a href="https://www.britannica.com/topic/percentage">Britannica</a>)
Percentile: |
The set of divisions that produce exactly 100 equal parts in a series of continuous values, such as blood pressure, weight, height, etc. Thus a person with blood pressure above the 80th percentile has a greater blood pressure value than over 80% of the other recorded values. (CDS, p.323)
Permutation: |
A mathematical technique that determines the number of possible arrangements in a set when the order matters (in contrast to a COMBINATION, where it does not). The study of permutations is an important topic in the field of [combinatorics].
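For example, in Python:

```python
import math
from itertools import permutations

# Number of ways to arrange 3 items chosen from 5 when order matters:
# P(5, 3) = 5! / (5 - 3)! = 60
n_arrangements = math.perm(5, 3)

# Compare with a combination, where order does not matter: C(5, 3) = 10
n_selections = math.comb(5, 3)

# Enumerating the arrangements directly gives the same count.
listed = list(permutations("abcde", 3))
```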
Pew Research Center: |
The Pew Research Center (also simply known as Pew) is a nonpartisan American think tank based in Washington, D.C. It provides information on social issues, public opinion, and demographic trends shaping the United States and the world. It also conducts public opinion polling, demographic research, random sample survey research, and panel based surveys, media content analysis, and other empirical social science research. (<a href="https://en.wikipedia.org/wiki/Pew_Research_Center">Wikipedia</a>)
Phi coefficient: |
The phi coefficient is a measure of effect size used to determine the strength of the relationship between two binary variables; often reported with the results of a chi-squared test. (SwR, Glossary)
PI: |
[Percentile] intervals (PIs) assign equal probability mass to each tail. They are computed using [quantile]s. (<a href = "https://rdrr.io/github/rmcelreath/rethinking/man/HPDI.html">Help for HPDI {rethinking}</a>)
Pie Charts: |
Pie charts are used to show parts of a whole; pie charts get their name from looking like a pie with pieces representing different groups, and they are not recommended for most situations because they can be difficult to interpret.
PIP: |
The 'Poverty and Inequality Platform' (PIP) is an interactive computational tool that offers users quick access to the World Bank’s estimates of poverty, inequality, and shared prosperity. PIP provides a comprehensive view of global, regional, and country-level trends for more than 160 economies around the world. [The World Bank](https://pip.worldbank.org/about#PIP_AT_A_GLANCE)
Planned Comparisons: |
Planned comparisons is a statistical strategy for comparing different groups, often used after a statistically significant analysis of variance to test hypotheses about which group means are statistically significantly different from one another. (SwR, Glossary)
Platykurtic: |
Platykurtic is a distribution of a numeric variable that has fewer observations in the tails than a normal distribution would have; platykurtic distributions often look flatter than a normal distribution. (SwR, Glossary)
PMF: |
A probability mass function (PMF) is a mathematical function that calculates the probability a discrete random variable will be a specific value. PMFs also describe the probability distribution for the full range of values for a discrete variable. Probability mass functions find the LIKELIHOOD of a particular outcome. Using a PMF to calculate the likelihoods for all possible values of the discrete variable produces its PROBABILITY DISTRIBUTION.(<a href="https://statisticsbyjim.com/probability/probability-mass-function/">Statistics By Jim</a>)
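A sketch in Python using the binomial PMF (the coin-flip numbers are illustrative):

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a Binomial(n, p) random variable."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Probability of exactly 2 heads in 4 fair coin flips: 6/16 = 0.375
p_two_heads = binom_pmf(2, 4, 0.5)

# Evaluating the PMF over all possible values gives the full probability
# distribution, so the probabilities must sum to 1.
total = sum(binom_pmf(k, 4, 0.5) for k in range(5))
```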
Point Charts: |
Point charts are charts that show summary values for a numeric variable, typically across groups; for example, a point chart could be used in place of a bar graph to show mean or median across groups. (SwR, Glossary)
Polynomials: |
Polynomials are algebraic expressions made up of one or more terms. MONOMIALS and BINOMIALS are both types of polynomials. Other examples include \[2x^2+3x+1\] and \[-5y^3+2y^2-6y+8\]. ([Khan Academy](https://www.khanacademy.org/math/algebra/x2f8bb11595b61c86:quadratics-multiplying-factoring/x2f8bb11595b61c86:factor-quadratics-strategy/a/quadratics-multiplying-factoring-faq))
Polynomial Degree: |
The degree of a polynomial term is the sum of the exponents of the variables that appear in it. (<a href="https://en.wikipedia.org/wiki/Degree_of_a_polynomial">Wikipedia</a>)
Polynomial Regression: |
It is a form of regression analysis in which the relationship between the independent variable `x` and the dependent variable `y` is modelled as an nth degree polynomial in `x`. Polynomial regression fits a nonlinear relationship between the value of `x` and the corresponding conditional mean of `y`. Although polynomial regression fits a nonlinear model to the data, as a statistical estimation problem it is linear. What “linear” means in this context is that $\mu_{i}$ is a linear function of any single parameter. For this reason, polynomial regression is considered to be a special case of multiple linear regression. ([Wikipedia](https://en.wikipedia.org/wiki/Polynomial_regression)) (Chap.4)
Pooled Variance: |
Pooled variance is the assumption that the variances in two groups are equal, so these variances are combined ('pooled') (SwR, Glossary).
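A small Python sketch of the usual pooled-variance formula on toy data, weighting each sample variance by its degrees of freedom:

```python
from statistics import variance

# Two groups assumed to share the same population variance (toy data).
# Pooled estimate: s_p^2 = ((n1-1)*s1^2 + (n2-1)*s2^2) / (n1 + n2 - 2)
group1 = [4, 6, 8, 10]
group2 = [5, 7, 9]

n1, n2 = len(group1), len(group2)
s1, s2 = variance(group1), variance(group2)
pooled = ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)
```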
Population: |
A population consists statistically of all the observations that fit some criterion; for example, all of the people currently living in the country of Bhutan or all of the people in the world currently eating strawberry ice cream. (SwR, Glossary)
PositiveCorr: |
Positive correlation is a statistical relationship where two things move together in the same direction; as one goes up, the other also goes up, or as one goes down, the other also goes down. (SwR, Glossary)
Positively Skewed: |
A distribution is positively skewed when it has some extreme large positive values relative to the rest of the values in the distribution, making the tail of the distribution extend to the right or in the direction of larger positive numbers; this type of skew is also known as right skew. (SwR, Glossary)
Posterior Distribution: |
Given prior information combined with data from observations or experiments, the posterior summarizes all you know after factoring in that new evidence. It provides estimates of parameters like intervals or points as well as predictions about future data outcomes through probabilistic evaluations to help inform decisions under uncertain conditions. (<a href="https://www.statisticshowto.com/posterior-distribution-probability/">Statistics How To</a>)
Posterior Odds: |
In probability theory, odds provide a measure of the likelihood of a particular outcome: the ratio of the probability that the outcome occurs to the probability that the outcome does not occur. (<a href="https://en.wikipedia.org/wiki/Odds">Wikipedia</a>) Posterior odds are these odds computed from posterior probabilities, i.e., after updating on the observed data.
Posterior Predictive Distribution: |
An approach to assessing model fit. It is the distribution for future predicted data based on the data you have already seen. Measures of discrepancy between the estimated model and the data are constructed and their posterior predictive distribution compared to the discrepancy observed for the dataset. (CDS, p. 334)
Posterior Probability: |
It is the revised or updated probability of an event occurring after taking into consideration new information. ([Investopedia](https://www.investopedia.com/terms/p/posterior-probability.asp)). Posterior probability = prior probability + new evidence (called likelihood). ([Statistics How To](https://www.statisticshowto.com/posterior-distribution-probability/)) The posterior distribution will be a distribution of Gaussian distributions. (SR, Chap.4). It quantifies exactly how much our observed data changes our beliefs: P(belief | data) (BF, Chap.8)
POTNI: |
Post-tax National Income (POTNI) adds all other government transfers and deducts all taxes from the PRTNI. Therefore, POTNI includes all social monetary transfers, transfers in-kind and collective consumption. The allocation of all forms of government spending to individuals ensures that the sum of POTNI equals the national income ([Distributional National Accounts (DINA) for Austria, 2004-2016](https://wid.world/document/distributional-national-accounts-dina-for-austria-2004-2016-world-inequality-lab-wp-2020-23/)).
Power: |
Statistical Power is the probability that the results of a test are not a Type II error; it is the probability of finding a relationship when there is a relationship. (SwR, Glossary)
Power Transformations: |
Power transformations are transformations of a measure using an exponent like squaring or cubing or taking the square root or cube root; power transformations are nonlinear transformations. (SwR, Glossary)
PPP: |
Purchasing Power Parity (PPP) is the exchange rate that equates the price of a basket of identical traded goods and services in two countries. Converting values to PPP therefore accounts for differences in costs of living between countries, enabling comparisons between income levels in different countries. (<a href="https://wir2022.wid.world/www-site/uploads/2021/12/WIR2022-Technical-Note-Figures-Tables-1.pdf">WIR2022 - Technical Notes</a>, p.6)
Prediction: |
Prediction values are values of the outcome variable that were determined by substituting data for the independent variables into a regression model and computing the predicted value of the outcome. (SwR, Glossary)
Predictor Variable: |
Predictor variable -- also known sometimes as the independent or explanatory variable -- is the counterpart to the response or dependent variable. Predictor variables are used to make predictions for dependent variables. ([DeepAI](https://deepai.org/machine-learning-glossary-and-terms/predictor-variable), [MiniTab](https://support.minitab.com/en-us/minitab/21/help-and-how-to/statistical-modeling/regression/supporting-topics/basics/what-are-response-and-predictor-variables/))
Prevalence: |
Prevalence is the proportion of individuals in a population who have a specific characteristic at a certain time period. (<a href="https://www.nimh.nih.gov/health/statistics/what-is-prevalence">NIH</a>, <a href="https://www.statology.org/prevalence-in-statistics/">Statology</a>)
Prior Odds: |
When comparing two events, it is common to phrase probability statements in terms of odds. Prior odds are the ratio of the a priori probability that the outcome occurs to the probability that the outcome does not occur, before seeing the data. (<a href="https://ocw.mit.edu/courses/18-05-introduction-to-probability-and-statistics-spring-2014/65200614be80c2c1efcd0f9f3db8c0e7_MIT18_05S14_Reading12b.pdf">Bayesian Updating: Odds</a>)
Prior Predictive Simulation: |
It is an essential part of modeling. Once you’ve chosen priors for all variables, these priors imply a joint prior distribution. By simulating from this distribution, you can see what your choices imply about the observable variable, which helps to diagnose bad choices. Prior predictive simulation is therefore very useful for assigning sensible priors. (Chap.4)
Prior Probability: |
The Prior Probability, also called the Prior, is the assumed probability distribution before we have seen the data. (<a href="https://en.wikipedia.org/wiki/Prior_probability">Wikipedia</a>) It quantifies how likely our initial belief is: P(belief). (BF, Chap.8)
Probabilities: |
Probability is a mathematical tool used to study randomness. It deals with the chance of an event occurring. ([OpenStax: Statistics](https://openstax.org/books/statistics/pages/1-1-definitions-of-statistics-probability-and-key-terms)) In the discrete case, to calculate the probability that a random variable takes on any value within a range, we sum the individual probabilities corresponding to each of the values. We use Pr to explicitly state that the result is a probability from a discrete probability distribution, whereas p(value) is a probability density from a continuous probability distribution. (Bayesian Statistics, Chap.3)
Probability Density Function: |
A probability density function (PDF) tells us the probability that a random variable takes on a certain value. (<a href="https://www.statology.org/cdf-vs-pdf/">Statology</a>) The probability density function (PDF) for a given value of random variable X represents the density of probability (probability per unit random variable) within a particular range of that random variable X. Probability densities can take values larger than 1. ([StackExchange Mathematics](https://math.stackexchange.com/a/1464837/1215136)) We can use a continuous probability distribution to calculate the probability that a random variable lies within an interval of possible values. To do this, we use the continuous analogue of a sum, an integral. However, we recognise that calculating an integral is equivalent to calculating the area under a probability density curve. We use `p(value)` for probability densities and `Pr` for probabilities. (Bayesian Statistics, Chap.3)
Probability Distribution: |
It is a way of describing all possible events and the probability of each one happening. Probability distributions are also very useful for asking questions about ranges of possible values. (BF, Chap.4) The two defining features are: (1) All values of the distribution must be real and non-negative. (2) The sum (for discrete random variables) or integral (for continuous random variables) across all possible values of the random variable must be 1. (BS, Chap.3)
Probability Mass Function: |
A probability mass function (PMF) is a function that gives the probability that a discrete random variable is exactly equal to some value. Sometimes it is also known as probability function, frequency function or discrete probability density function. (<a href="https://en.wikipedia.org/wiki/Probability_mass_function">Wikipedia</a>)
Prolog: |
A set of comments at the top of a code file that provides information about what is in the file. (SwR, Glossary)
Proportional-Odds-Ratios: |
Proportional odds ratios are odds ratios resulting from ordinal regression that represent the odds of being in a higher group or groups compared to being in all the lower groups with each one-unit increase in the predictor. (SwR, Glossary)
Proportional Operator: |
The proportional symbol (`∝`) is pronounced as "varies as" or "is proportional to". ([UEfAP](http://www.uefap.com/speaking/symbols/symbols.htm)) It is produced in Markdown with `$\propto$`. (<a href="https://oeis.org/wiki/List_of_LaTeX_mathematical_symbols">List of LaTeX mathematical symbols</a>)
Protocol: |
A protocol is a system of rules that define how data is exchanged within or between computers. Communications between devices require that the devices agree on the format of the data that is being exchanged. The set of rules that defines a format is called a protocol. (<a href="https://developer.mozilla.org/en-US/docs/Glossary/Protocol">MDN web docs</a>)
PRTNI: |
Pre-tax National Income (PRTNI) is the sum of all income flows gained by the individual owner of the factors of production, labour and capital, after taking into account the operation of the pension system as well as unemployment insurance system. Accordingly, pensions and unemployment benefits are included ([Distributional National Accounts (DINA) for Austria, 2004-2016](https://wid.world/document/distributional-national-accounts-dina-for-austria-2004-2016-world-inequality-lab-wp-2020-23/)).
Q-Q-Plot: |
A quantile-quantile plot is a visualization of data using probabilities to show how closely a variable follows a normal distribution. (SwR, Glossary) This plot is made up of points below which a certain percentage of the observations fall. On the x-axis are normally distributed values with a mean of 0 and a standard deviation of 1. On the y-axis are the observations from the data. If the data are normally distributed, the values will form a diagonal line through the graph. (SwR, chapter 6)
QRP: |
Questionable Research Practice (QRP) is a research practice that introduces bias, usually in pursuit of statistical significance; an example of such practices might be dropping or recoding values or variables solely to improve a model fit statistic. (SwR, Glossary)
Quantile: |
Quantiles are cut points dividing the range of a probability distribution into continuous intervals with equal probabilities. (<a href="https://en.wikipedia.org/wiki/Quantile">Wikipedia</a>)
Quadratic Approximation: |
Quadratic approximation is a way to approximate a curve. Quadratic approximation is an extension of linear approximation – we’re adding one more term, which is related to the second derivative. Linear approximation uses the first derivative to find the straight line that most closely resembles a curve at some point. Quadratic approximation uses the first and second derivatives to find the parabola closest to the curve near a point. ([Statistics How To](https://www.statisticshowto.com/quadratic-approximation/) and [MIT OpenCourseWare](https://ocw.mit.edu/courses/18-01sc-single-variable-calculus-fall-2010/pages/unit-2-applications-of-differentiation/part-a-approximation-and-curve-sketching/session-25-introduction-to-quadratic-appoximation/))
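A numeric illustration in Python for f(x) = exp(x) near x0 = 0, where f, f', and f'' all equal 1 at x0:

```python
import math

# Quadratic approximation of f(x) = exp(x) around x0 = 0:
# f(x) ~ f(x0) + f'(x0)*(x - x0) + f''(x0)*(x - x0)^2 / 2
x0 = 0.0

def quad_approx(x):
    # For exp at x0 = 0, f(x0) = f'(x0) = f''(x0) = 1.
    return 1.0 + 1.0 * (x - x0) + 0.5 * (x - x0) ** 2

# Close to x0 the parabola tracks the curve well...
err_near = abs(math.exp(0.1) - quad_approx(0.1))
# ...but the approximation degrades further away.
err_far = abs(math.exp(1.0) - quad_approx(1.0))
```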
R-Forge: |
R-Forge offers a central platform for the development of R packages, R-related software and further projects. It is based on FusionForge offering easy access to the best in SVN, daily built and checked packages, mailing lists, bug tracking, message boards/forums, site hosting, permanent file archival, full backups, and total web-based administration. (<a href="https://r-forge.r-project.org/">R-Forge Home</a>)
Random Variables: |
A statistical term for variables that associate different numeric values with each of the possible outcomes of some random process. By random here we do not mean the colloquial use of this term to mean something that is entirely unpredictable. A random process is simply a process whose outcome cannot be perfectly known ahead of time (it may nonetheless be quite predictable). (Chap.3)
Range: |
Range is the highest and lowest values of a variable, showing the full spread of values; the range can also be reported as a single number computed by taking the difference between the highest and lowest values of the variable. (SwR, Glossary)
Regression-Outlier: |
A regression outlier is an observation that has an unusual value for the outcome given its value(s) of predictor(s). (SwR, Glossary)
Rejection Region: |
Rejection region is the area under the curve of a sampling distribution where the probability of obtaining a value is very small, often below 5%; the rejection region is in the end of the tail or tails of the distribution. (SwR, Glossary)
Residually: |
Residuals are the differences between the observed values and the predicted values. (SwR, Glossary)
R-squared: |
R-squared is the percent of variance in a numeric variable that is explained by one or more other variables; the r-squared is also known as the coefficient of determination and is used as a measure of model fit in linear regression and an effect size in correlation analyses. (SwR, Glossary)
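The calculation can be sketched in a few lines (hand-rolled Python with made-up data, purely for illustration; in R the value is reported by `summary()` of a fitted model):

```python
# R-squared = 1 - (unexplained variance / total variance)
x = [1, 2, 3]
y = [1, 2, 2]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Least-squares slope and intercept
slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / \
        sum((xi - mean_x) ** 2 for xi in x)
intercept = mean_y - slope * mean_x

predicted = [intercept + slope * xi for xi in x]
ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))  # unexplained
ss_tot = sum((yi - mean_y) ** 2 for yi in y)                  # total

r_squared = 1 - ss_res / ss_tot
print(r_squared)  # about 0.75: the line explains 75% of the variance in y
```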
Samples: |
Samples are subsets of observations from some population that is often analyzed to learn about the population sampled. (SwR, Glossary)
Sample Space: |
The collection of all outcomes that are possible is the sample space. (Albert and Hu, 2019, p. 27)
Sampling Distribution: |
A sampling distribution is the distribution of summary statistics, like means, from repeated samples taken from a population. (SwR, Glossary)
Scatterplot: |
A scatterplot is a graph that shows one dot for each observation in the data set (SwR, Glossary)
ScatterplotMatrix: |
A scatterplot matrix arranges multiple scatterplots on a grid so that they are easy to compare to one another. The matrix arrangement allows you to look at many different relationships between multiple variables in a dataset all at once, which can be very useful for exploratory data analysis. (<a href="https://unc-libraries-data.github.io/R-Open-Labs/Extras/ggally/ggally.html">R-Open-Labs</a>)
Scientific Notation: |
Scientific notation is a way to display very large or very small numbers by multiplying the number by the value of 10 to some power to move the decimal to the left or right; for example, 1,430,000,000 could be displayed as 1.43 × 10^9 in scientific notation. (SwR, Glossary)
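Most programming languages write this with `e` notation; for instance, in Python (the variable name here is just for illustration):

```python
# 1.43e+09 means 1.43 x 10^9
population = 1_430_000_000
print(f"{population:.2e}")  # 1.43e+09

# Scientific notation is parsed back the same way:
print(1.43e9 == 1_430_000_000)  # True
```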
Sensitivity: |
The true positive rate (one minus the false negative rate) is referred to as sensitivity, recall, or probability of detection. (Bayesian Thinking, Chap.1) It is also the percentage of “yes” values or 1s a logistic regression model got right. (SwR, Glossary)
Shapiro-Wilk: |
The Shapiro-Wilk test is a statistical test to determine or confirm whether a variable has a normal distribution; it is sensitive to small deviations from normality and not useful for sample sizes above 5,000 because it will nearly always find non-normality. (SwR, Glossary)
Sigmoid: |
A sigmoid function is any mathematical function whose graph has a characteristic S-shaped curve or sigmoid curve. (<a href="https://en.wikipedia.org/wiki/Sigmoid_function">Wikipedia</a>)
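The most common example is the logistic function, sketched here in Python for illustration:

```python
import math

# Logistic sigmoid: maps any real number into the interval (0, 1)
def sigmoid(x):
    return 1 / (1 + math.exp(-x))

print(sigmoid(0))   # 0.5, the midpoint of the S-curve
print(sigmoid(5))   # close to 1
print(sigmoid(-5))  # close to 0
```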
Sign-Test: |
Sign-test is a statistical test that compares the median of a variable to a hypothesized or population value; it is used in lieu of the one-sample t-test when the t-test assumptions are not met. (SwR, Glossary)
Similar Operator: |
See: Tilde Operator.
Simple-Linear-Regression: |
Simple does not mean easy; instead, it is the term used for a statistical model used to predict or explain a continuous outcome by a single predictor. (SwR, Glossary, Chap09)
Simulation: |
Simulation is a way to model random events, such that simulated outcomes closely match real-world outcomes. By observing simulated outcomes, researchers gain insight on the real world. (<a href="https://stattrek.com/experiments/simulation#">Stat Trek</a>)
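For example, the probability of heads for a fair coin can be estimated by simulation. A Python sketch (the seed is arbitrary, chosen only to make the run reproducible):

```python
import random

random.seed(1)  # make the simulated outcomes reproducible

# Simulate 10,000 fair coin flips and estimate P(heads)
flips = [random.random() < 0.5 for _ in range(10_000)]
estimate = sum(flips) / len(flips)
print(estimate)  # close to the true probability of 0.5
```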
Skewness: |
Skewness is the extent to which a variable has extreme values in one of the two tails of its distribution (SwR, Glossary)
Slope: |
The slope is the increase in the dependent variable when the independent variable increases by one unit and all other independent variables remain the same. (<a href="https://link.springer.com/referenceworkentry/10.1007/978-94-007-0753-5_1486">Intercept, Slope in Regression</a>)
Spearman: |
Spearman’s rho is a statistical test used to examine the strength, direction, and significance of the relationship between two numeric variables when they do not meet the assumptions for [Pearson]’s r. (SwR, Glossary)
Specificity: |
The true negative rate (one minus the false positive rate) is referred to as specificity. (Bayesian Thinking, Chap.1) It is also the percentage of “no” values or 0s a logistic regression model predicted correctly. (SwR, Glossary)
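Both specificity and sensitivity come straight from a confusion matrix. A small Python sketch with hypothetical counts (the numbers are made up for illustration):

```python
# Hypothetical confusion-matrix counts for a classifier
true_positives  = 40   # predicted "yes", actually "yes"
false_negatives = 10   # predicted "no",  actually "yes"
true_negatives  = 45   # predicted "no",  actually "no"
false_positives = 5    # predicted "yes", actually "no"

sensitivity = true_positives / (true_positives + false_negatives)
specificity = true_negatives / (true_negatives + false_positives)

print(sensitivity)  # 0.8, the share of actual "yes" cases caught
print(specificity)  # 0.9, the share of actual "no" cases caught
```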
SSP: |
SSP stands for Syringe Services Program (SwR)
Statista: |
Statista is a global data and business intelligence platform with an extensive collection of statistics, reports, and insights on over 80,000 topics from 22,500 sources in 170 industries. Established in Germany in 2007, Statista operates in 13 locations worldwide and employs around 1,100 professionals. (<a href="https://www.statista.com/aboutus/">statista</a>)
Stacked Bar Chart: |
A stacked bar chart is a data visualization that shows parts of a whole in a bar chart format; this type of chart can be used to examine two categorical variables together by showing the categories of one variable as the bars and the categories of the other variable as different colors within each bar. (SwR, Glossary)
StackOverflow: |
Stack Overflow is a question-and-answer website for computer programmers. (<a href="https://en.wikipedia.org/wiki/Stack_Overflow">Wikipedia</a>)
Stan Platform: |
Stan is a state-of-the-art platform for statistical modeling and high-performance statistical computation. Stan interfaces with the most popular data analysis languages (R, Python, shell, MATLAB, Julia, Stata) and runs on all major platforms (Linux, Mac, Windows). Users specify log density functions in Stan’s probabilistic programming language and get (a) full Bayesian statistical inference with MCMC sampling (NUTS, HMC), (b) approximate Bayesian inference with variational inference (ADVI), (c) penalized maximum likelihood estimation with optimization (L-BFGS). Stan is named in honor of Stanislaw Ulam (1909-1984), co-inventor of the Monte Carlo method. (<a href="https://mc-stan.org/">STAN website</a> and <a href="https://www.r-bloggers.com/2019/01/an-introduction-to-stan-with-r/">R-Bloggers</a>)
Standard Deviation: |
The standard deviation is a measure of the amount of variation or dispersion of a set of values. A low standard deviation indicates that the values tend to be close to the mean (also called the expected value) of the set, while a high standard deviation indicates that the values are spread out over a wider range. The standard deviation is the square root of its variance. A useful property of the standard deviation is that, unlike the variance, it is expressed in the same unit as the data. Standard deviation may be abbreviated SD, and is most commonly represented in mathematical texts and equations by the lower case Greek letter $\sigma$ (sigma), for the population standard deviation, or the Latin letter $s$ for the sample standard deviation. ([Wikipedia](https://en.wikipedia.org/wiki/Standard_deviation))
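A Python sketch (with example data chosen for round numbers) showing that the standard deviation is the square root of the variance:

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # mean is 5

# Population variance and standard deviation (divide by n)
variance = statistics.pvariance(data)
sd = statistics.pstdev(data)

print(variance)  # 4.0
print(sd)        # 2.0
print(math.isclose(sd, math.sqrt(variance)))  # True
```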
Standard Error: |
The standard error (SE) of a statistic is the standard deviation of its [sampling distribution]. If the statistic is the sample mean, it is called the standard error of the mean (SEM). (<a href="https://en.wikipedia.org/wiki/Standard_error">Wikipedia</a>) The standard error is a measure of variability that estimates how much variability there is in a population based on the variability in the sample and the size of the sample. (SwR, Glossary)
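A minimal Python sketch of the standard error of the mean, using made-up sample data:

```python
import math
import statistics

sample = [2, 4, 4, 4, 5, 5, 7, 9]

# SEM = sample standard deviation / sqrt(sample size)
sem = statistics.stdev(sample) / math.sqrt(len(sample))
print(sem)  # about 0.756

# Larger samples give smaller standard errors, because the
# denominator sqrt(n) grows with the sample size.
```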
Standardization: |
In statistics, standardization (also called Normalizing) is the process of putting different variables on the same scale. This process allows you to compare scores between different types of variables. Typically, to standardize variables, you calculate the mean and standard deviation for a variable. Then, for each observed value of the variable, you subtract the mean and divide by the standard deviation. ([Statistics by Jim](https://statisticsbyjim.com/glossary/standardization/)) See `scale()` in R. (Chap.4)
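A Python sketch of what R's `scale()` does with default settings (it uses the sample standard deviation; the data here are invented for round numbers):

```python
import statistics

values = [10, 20, 30]

mean = statistics.mean(values)  # 20
sd = statistics.stdev(values)   # 10 (sample standard deviation)

# Subtract the mean and divide by the standard deviation
z_scores = [(v - mean) / sd for v in values]
print(z_scores)  # [-1.0, 0.0, 1.0]
```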
Standardized Residuals: |
Standardized residuals are the standardized differences between observed and expected values in a chi-squared analysis; a large standardized residual indicates that the observed and expected values were very different. (SwR, Glossary)
Statistical Model: |
A statistical model is an expression that attempts to explain patterns in the observed values of a response variable by relating the response variable to a set of predictor variables and parameters. ([Monash University](https://users.monash.edu.au/~murray/stats/BIO4200/LinearModels.pdf)) Statistical models are mappings of one set of variables through a probability distribution onto another set of variables. Fundamentally, these models define the ways values of some variables can arise, given values of other variables. Because it can be quite hard to anticipate how priors influence the observable variables, it often helps to simulate from the model before fitting it. (Chap.4)
Statistically Significant: |
The result of a statistical test that indicates the test statistic is unlikely to have been as large as it was if the null hypothesis were true is called statistically significant. This does not necessarily mean the differences are important or practically significant, just that they are bigger than what would most likely have happened if there were no relationship in the population between the variables involved. (SwR Glossary and Chap. 5)
Statistics Discipline: |
Statistics is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. Two main statistical methods are used in data analysis: descriptive statistics, which summarize data from a sample using indexes such as the mean or standard deviation, and inferential statistics, which draw conclusions from data that are subject to random variation.
Stochastic: |
A stochastic relationship is a mapping of a variable or parameter onto a distribution. It is said to be "stochastic" because no single instance of the variable on the left of the equation is known with certainty. Instead, the mapping is probabilistic: Some values are more plausible than others, but very many different values are plausible under any model. (Chap.4)
Student: |
The Student's t-test is a statistical test used to determine whether the difference between the responses of two groups is statistically significant. (<a href="https://en.wikipedia.org/wiki/Student%27s_t-test">Wikipedia</a>)
SwR: |
SwR is my abbreviation of: Harris, J. K. (2020). Statistics With R: Solving Problems Using Real-World Data (Illustrated Edition). SAGE Publications, Inc.
Table 1: |
Descriptive statistics are often displayed in the first table in a published article or report and are therefore often called <i>Table 1 statistics</i> or the <i>descriptives</i>. (SwR)
Tail: |
The values to the far right and far left of a distribution are called the 'tail' of the distribution. (SwR, Glossary)
T-Statistic: |
The T-Statistic is used in a T test when you are deciding if you should support or reject the null hypothesis. It’s very similar to a Z-score and you use it in the same way: find a cut off point, find your t score, and compare the two. You use the t statistic when you have a small sample size, or if you don’t know the population standard deviation. (<a href="https://www.statisticshowto.com/t-statistic/">Statistics How-To</a>)
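A one-sample t statistic, sketched in Python with made-up data and a hypothesized mean chosen for illustration:

```python
import math
import statistics

# Test whether the sample mean differs from a hypothesized mean of 4
sample = [2, 4, 4, 4, 5, 5, 7, 9]  # sample mean is 5
hypothesized_mean = 4

# t = (sample mean - hypothesized mean) / standard error of the mean
t = (statistics.mean(sample) - hypothesized_mean) / \
    (statistics.stdev(sample) / math.sqrt(len(sample)))
print(t)  # about 1.32; compare this to a cut-off from the t-distribution
```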
T-Test: |
A t-test is a type of statistical analysis used to compare the averages of two groups and determine whether the differences between them are more likely to arise from random chance. (<a href="https://en.wikipedia.org/wiki/Student%27s_t-test">Wikipedia</a>)
Tibble: |
A tibble or tbl_df, is a modern reimagining of the data.frame. It never changes the type of the inputs (e.g. it never converts strings to factors!), it never changes the names of variables, and it never creates row names. It’s possible for a tibble to have column names that are not valid R variable names, aka non-syntactic names. Tibbles have a refined print method that shows only the first 10 rows, and all the columns that fit on screen. (<a href="https://r4ds.had.co.nz/tibbles.html">R4DS</a>)
Tilde Operator: |
The tilde ('~') in R formulae signifies a stochastic relationship. This is in contrast to the "=" symbol, which means a deterministic relationship or definition (more details in [R Intro](https://cran.r-project.org/doc/manuals/R-intro.html#Statistical-models-in-R)). The [tilde operator](https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/tilde) is used to separate the left- and right-hand sides in a model [formula]. It can be pronounced as "is modelled by" or in a formula "is dependent on" or "is explained by". In Markdown it is produced with $\sim$.
Tile Plots: |
Tile plots are matrices of tiles. For each tile, either the "width", "height", "area", or squared area is proportional to the corresponding entry. (Help file of `vcd::tile()`)
Trace Rank Plot: |
See: Trank Plot.
Trace Plot: |
A trace plot is a chain visualization that plots the samples in sequential order, joined by a line. (Chap.9)
Trank Plot: |
Trace Rank Plot or as McElreath's suggest a Trank Plot visualizes the chains as a distribution of the ranked samples. What this means is to take all the samples for each individual parameter and rank them. The lowest sample gets rank 1. The largest gets the maximum rank (the number of samples across all chains). Then we draw a histogram of these ranks for each individual chain. Why do this? Because if the chains are exploring the same space efficiently, the histograms should be similar to one another and largely overlapping. (Chap.9)
Transgender: |
Transgender people are people whose biological sex is not consistent with their gender. Contrast: [Cisgender] (SwR, Chapter 2)
Trend Line: |
A trend line is a line that follows the relationship between variables in a graph, sometimes called the line of best fit (SwR, Glossary).
Tukey: |
Tukey’s Honestly Significant Difference (HSD) is a post hoc test to determine which means are statistically significantly different from each other following a significant ANOVA result; Tukey’s HSD compares each pair of means and so is considered a pairwise test, but it is less conservative than the [Bonferroni] post hoc test. (SwR, Glossary)
Two-way: |
Two-way ANOVA is an analysis of variance (ANOVA) with the means of a numeric variable compared across the categories of two categorical predictors. (SwR, Glossary)
Type-1: |
Rejecting the null hypothesis when it should be retained is called a Type I error; its probability, alpha, is used as the threshold to determine statistical significance. (SwR, Glossary)
Type-2: |
Retaining the null hypothesis when it should be rejected is called Type II error or beta. (SwR, Glossary)
UCR: |
The Uniform Crime Reporting (UCR) Program generates reliable statistics for use in law enforcement. It also provides information for students of criminal justice, researchers, the media, and the public. The program has been providing crime statistics since 1930. The UCR Program includes data from more than 18,000 city, university and college, county, state, tribal, and federal law enforcement agencies. (<a href="https://www.fbi.gov/services/cjis/ucr">UCR</a>)
Unexplained variance: |
Unexplained variance is variability in the outcome that is not explained by the predictor(s); the unexplained variability and explained variability are used in calculations for model significance and model fit. (SwR, Glossary)
Uniform Crime Reporting: |
See: UCR.
Variance var: |
Variance is the expected value of the squared deviation from the mean of a random variable; equivalently, it is the square of the standard deviation. Variance is a measure of dispersion, meaning it is a measure of how far a set of numbers is spread out from their average value. It is the second central moment of a distribution and the covariance of the random variable with itself, and it is often represented by `σ^2`, `s^2`, `Var(X)`, or `V(X)`. ([Wikipedia](https://en.wikipedia.org/wiki/Variance))
VIF: |
The variance inflation factor (VIF) is a statistic for determining whether there is problematic multicollinearity in a linear regression model. (SwR, Glossary)
Vignette: |
A vignette is a long-form guide to your package. (Chapter <a href="https://r-pkgs.org/vignettes.html">Vignettes</a> in R Packages 2e)
Violin Plots: |
Violin plots are visual displays of data that combine features of density plots and boxplots to show the distribution of numeric variables, often across groups. (SwR, Glossary)
Waffle Charts: |
Waffle Charts are visual displays of data that show the parts of a whole similar to a pie chart; waffle charts are generally preferred over pie charts. (SwR, Glossary)
Wald: |
Wald test is the statistical test for comparing the value of the coefficient in linear or logistic regression to the hypothesized value of zero; the form is similar to a one-sample t-test, although some Wald tests use a t-statistic and others use a z-statistic as the test statistic. (SwR, Glossary)
Welch-T: |
Welch’s t-test is a variation on the Student’s t-test that does not assume equal variances in the groups being compared. (SwR, Glossary)
Welch-F: |
Welch’s F-statistic is an alternate F-statistic used in analysis of variance when the assumption of homogeneity of variance is not met; the calculations for the Welch’s F-statistic use weights to calculate the group means and the grand mean. (SwR, Glossary)
WID: |
The 'World Inequality Database' (WID) aims to provide open and convenient access to the most extensive available database on the historical evolution of the world distribution of income and wealth, both within countries and between countries. [WID.WORLD](https://wid.world/wid-world/)
WIL: |
The 'World Inequality Lab (WIL) is a global research center focused on the study of inequality and public policies that promote social, economic and environmental justice. ([WIL](https://inequalitylab.world/en/our-mission/))
Wilcoxon: |
Wilcoxon signed-ranks test is an alternative to the dependent-samples t-test when the continuous variable is not normally distributed; it uses ranks to determine whether the values of a numeric variable are different across two related groups. (SwR, Glossary)
WLS: |
Weighted Least Squares is an extension of Ordinary Least Squares regression. Non-negative constants (weights) are attached to data points. It is used when the data violates the assumption of homoscedasticity; when you want to concentrate on certain areas; when you are running a logistic regression or any other procedure where data points should not be treated equally. (<a href="https://www.statisticshowto.com/weighted-least-squares/">Statistics How-To</a>)
XML: |
Extensible Markup Language (XML) is a markup language and file format for storing, transmitting, and reconstructing arbitrary data. It defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. The World Wide Web Consortium's XML 1.0 Specification of 1998 and several other related specifications —all of them free open standards—define XML. The design goals of XML emphasize simplicity, generality, and usability across the Internet. (<a href="https://en.wikipedia.org/wiki/XML">Wikipedia</a>)
xpt: |
.xpt is a file extension indicating that the file is a transport file from the SAS (Statistical Analysis System) statistical software program (SwR, Glossary)
Yates continuity correction: |
Yates continuity correction is a correction for chi-squared that subtracts .5 from the difference between observed and expected in each cell, making the chi-squared value smaller and statistical significance harder to reach; it is often used when there are few observations in one or more of the cells. This correction is also used when both variables have just two categories because the chi-squared distribution is not a perfect representation of the distribution of differences between observed and expected of a chi-squared in the situation where both variables are binary. (SwR, Glossary and Chap. 5)
Y-hat: |
The term 'y hat' (written as ŷ) refers to the estimated value of a response variable in a linear regression model. (<a href="https://www.statology.org/y-hat/">What is Y Hat in Statistics?</a>)
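A sketch of ŷ in Python, fitting a least-squares line by hand on invented data (in R this is what `predict()` on an `lm` fit returns):

```python
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]  # lies exactly on the line y = 1 + 2x

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Least-squares slope and intercept
slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / \
        sum((xi - mean_x) ** 2 for xi in x)
intercept = mean_y - slope * mean_x

# y hat: the value the fitted line predicts for a given x
def y_hat(new_x):
    return intercept + slope * new_x

print(slope, intercept)  # 2.0 1.0
print(y_hat(5))          # 11.0
```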
Z-distribution: |
The z-distribution is a normal distribution with a mean of 0 and a standard deviation of 1. (SwR, chap10)
Z-score: |
A z-score (also called a standard score) gives you an idea of how far from the mean a data point is. But more technically it’s a measure of how many standard deviations below or above the population mean a raw score is. (<a href="https://www.statisticshowto.com/probability-and-statistics/z-score/#Whatisazscore">StatisticsHowTo</a>)