CSE 442
1. Linear Regression:
a way to predict a numeric value by fitting a straight line to known data.
For example, you could use linear regression to predict how much money
you will make in the future based on how much money you made in the past,
or to estimate how much your property may be worth.
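As a rough illustration of the income example, the sketch below fits a line with the standard least-squares formulas. The function name fit_line and the income figures are made up for this example.

```python
# Ordinary least squares for one input variable, in plain Python.
# The income figures are hypothetical illustration data.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

years = [1, 2, 3, 4, 5]
income = [40.0, 42.0, 44.0, 46.0, 48.0]  # hypothetical, in thousands

slope, intercept = fit_line(years, income)
predicted_year_6 = slope * 6 + intercept  # extend the line one year ahead
print(slope, intercept, predicted_year_6)
```

With real data the points will not sit exactly on a line, and the fitted line is the one that minimizes the squared vertical distances to the points.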
2. Logistic Regression:
a statistical model that predicts the probability of an event occurring.
It is used when the dependent variable is binary (0 or 1, yes or no).
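A minimal sketch of the idea, assuming a single input and plain gradient descent on the log loss. The data (hours studied, pass or fail), the function names, and the learning rate are all assumptions for illustration.

```python
import math

def sigmoid(z):
    """Squash any real number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p(y=1 | x) = sigmoid(w*x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # gradient of the average log loss with respect to w and b
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# hypothetical data: hours studied -> passed (1) or failed (0)
hours = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
passed = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = fit_logistic(hours, passed)
p_low = sigmoid(w * 1.0 + b)   # estimated probability of passing after 1 hour
p_high = sigmoid(w * 4.0 + b)  # estimated probability of passing after 4 hours
print(p_low, p_high)
```

The output is a probability, so a yes/no prediction is usually made by checking whether it is above 0.5.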
3. Support Vector Machines:
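a model that learns from labeled examples where to draw the boundary between groups.
It is often used to classify things into groups, and it picks the boundary that
leaves the widest margin between the groups.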
4. Decision Trees:
an approach to help you make a decision by laying out the possible options as branches.
Each branch asks a simple question about the data, and following the answers
leads to the preferred option (the prediction).
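One way a tree can pick the question to ask at a branch is to try every threshold and keep the one that leaves the purest groups. A minimal sketch, assuming Gini impurity as the purity measure; the temperature data and the function names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best 'value <= threshold' question."""
    best = (None, float("inf"))
    ordered = sorted(set(values))
    # candidate thresholds are the midpoints between neighboring values
    for lo, hi in zip(ordered, ordered[1:]):
        threshold = (lo + hi) / 2
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best

# hypothetical data: temperature -> did we play outside?
temps = [15, 18, 21, 24, 27, 30]
plays = ["no", "no", "no", "yes", "yes", "yes"]

threshold, impurity = best_split(temps, plays)
print(threshold, impurity)
```

A full tree repeats this step on each resulting group until the groups are pure or small enough.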
5. Random Forests:
use it to predict things. It works by training a bunch of different decision trees,
each on a different random sample of the data (and of the features);
then, it makes a guess by voting or averaging over what the trees have learned.
6. Gradient Boosting:
a technique that combines multiple weaker models to create a stronger one.
The weaker models are added one at a time, each trained to correct the errors
(the gradient of the loss) left by the models before it,
and the final model is a weighted combination of all the weaker models.
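A minimal sketch of the idea for regression, assuming squared-error loss (where the negative gradient is simply the residual) and one-split "stumps" as the weak models. The data, the function names, and the learning rate are assumptions for illustration.

```python
def fit_stump(xs, residuals):
    """Weak model: one threshold, predicting the mean residual on each side."""
    best = None
    ordered = sorted(set(xs))
    for lo, hi in zip(ordered, ordered[1:]):
        t = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, t, left_mean, right_mean)
    _, t, left_mean, right_mean = best
    return lambda x: left_mean if x <= t else right_mean

def gradient_boost(xs, ys, rounds=50, learning_rate=0.3):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    stumps = []

    def predict(x):
        # weighted combination: every stump contributes learning_rate * its output
        return base + learning_rate * sum(s(x) for s in stumps)

    for _ in range(rounds):
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return predict

# hypothetical data
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 2.9, 3.1, 5.0, 5.2, 7.1, 6.9]

model = gradient_boost(xs, ys)
mean_y = sum(ys) / len(ys)
mse_baseline = sum((y - mean_y) ** 2 for y in ys) / len(ys)
mse_boosted = sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(ys)
print(mse_baseline, mse_boosted)
```

Each stump on its own is a poor model, but the error on the training data shrinks as more of them are added.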
7. Neural Networks:
a machine learning model that is used to capture complex patterns in data.
Unlike most other machine learning models, neural networks are composed
of a large number of interconnected processing nodes, or neurons, arranged in layers,
that learn to recognize patterns in input data.
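To show what "interconnected nodes" means, the sketch below wires three neurons into a tiny network that computes XOR, a pattern no single neuron of this kind can represent. The weights here are picked by hand for illustration; a real network would learn them from data.

```python
def step(z):
    """Activation: the neuron fires (1) only if its total input is positive."""
    return 1 if z > 0 else 0

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    return step(sum(i * w for i, w in zip(inputs, weights)) + bias)

def xor_network(a, b):
    # hidden layer: two neurons looking at the same inputs
    h_or = neuron([a, b], [1, 1], -0.5)   # fires if a OR b
    h_and = neuron([a, b], [1, 1], -1.5)  # fires if a AND b
    # output layer: fires if OR fired but AND did not
    return neuron([h_or, h_and], [1, -1], -0.5)

outputs = [xor_network(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(outputs)
```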
8. Principal Component Analysis (PCA):
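a technique used to find the main patterns of variation in data.
It looks at the data and finds the directions along which the data varies the most,
then describes the data using only those directions, which reduces the number of dimensions.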
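A minimal sketch of finding the direction of greatest variation for 2-D points, assuming power iteration on the covariance matrix (libraries typically use a matrix decomposition instead). The function name and the data are made up for this example.

```python
import math

def first_principal_component(points, iterations=100):
    """Unit vector along which 2-D points vary the most."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # entries of the 2x2 sample covariance matrix
    cxx = sum(x * x for x, _ in centered) / (n - 1)
    cxy = sum(x * y for x, y in centered) / (n - 1)
    cyy = sum(y * y for _, y in centered) / (n - 1)

    # power iteration: repeatedly multiply by the covariance matrix and renormalize.
    # Assumes the starting vector (1, 0) is not perpendicular to the answer.
    vx, vy = 1.0, 0.0
    for _ in range(iterations):
        nx = cxx * vx + cxy * vy
        ny = cxy * vx + cyy * vy
        norm = math.hypot(nx, ny)
        vx, vy = nx / norm, ny / norm
    return vx, vy

# hypothetical data lying along the line y = 2x
points = [(1, 2), (2, 4), (3, 6), (4, 8)]
vx, vy = first_principal_component(points)
print(vx, vy)
```

Projecting each centered point onto this vector gives a one-number summary of the point that keeps as much of the variation as possible.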
9. Linear Discriminant Analysis:
a machine learning technique that finds the combination of variables (features)
that best separates two or more known classes. LDA projects the data onto
that combination so that points from the same class fall close together and
points from different classes fall far apart. It is used to classify new
data points and to reduce the number of dimensions before other analysis.
10. K-Means Clustering:
a technique used in machine learning to group data into k clusters so that the points
in each cluster are similar to one another. It assigns each data point
(e.g., an item in a database) to its closest cluster center, moves each center to the
average of its points, and repeats until the groups stop changing.
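The usual procedure alternates two steps: assign every point to its nearest cluster center, then move each center to the average of its points. A minimal sketch for 2-D points; starting from the first k points is a naive choice made here only to keep the example deterministic, and the data is hypothetical.

```python
def kmeans(points, k, iterations=10):
    """Return (centroids, clusters) after a fixed number of assign/update rounds."""
    centroids = list(points[:k])  # naive initialization: the first k points
    for _ in range(iterations):
        # assignment step: each point joins the cluster of its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: (p[0] - centroids[i][0]) ** 2 + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # update step: move each centroid to the mean of its members
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centroids, clusters

# hypothetical data: two visibly separate groups
points = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
centroids, clusters = kmeans(points, k=2)
print(centroids)
print(clusters)
```

The number of clusters k has to be chosen in advance, and a poor starting position can lead to a poor grouping.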
11. Hierarchical Clustering:
a way of grouping data items into a hierarchy of nested groups, which makes the
structure of the data easier to understand. The (agglomerative) algorithm starts
with each data point in its own group and then combines the closest groups until there
is only one group left.
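The merge-the-closest-groups procedure can be sketched as follows. This assumes 1-D values and "single linkage" (the distance between two groups is the distance between their closest members); other linkage rules exist. The function name and the data are made up.

```python
def agglomerate(values):
    """Single-linkage agglomerative clustering of 1-D values.

    Returns the merge history as (group_a, group_b, distance) tuples.
    """
    clusters = [[v] for v in values]  # every point starts in its own group
    history = []
    while len(clusters) > 1:
        # find the pair of groups with the smallest distance between them
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        history.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)] + [merged]
    return history

# hypothetical data
history = agglomerate([0, 1, 5, 5.5, 12])
for group_a, group_b, distance in history:
    print(group_a, "+", group_b, "at distance", distance)
```

The merge history is the hierarchy: cutting it off at a chosen distance gives a flat set of groups.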
12. DBSCAN:
an algorithm that clusters data points based on density.
It groups points that have many close neighbors and marks isolated points in sparse regions as noise.
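A minimal sketch of the density idea for 2-D points. The parameter names eps (how close counts as "close") and min_pts (how many neighbors make a region dense) follow common usage; the data is hypothetical, and the brute-force neighbor search is only suitable for small inputs.

```python
def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""

    def neighbors(i):
        # indices of all points within eps of point i (including i itself)
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if (xi - xj) ** 2 + (yi - yj) ** 2 <= eps ** 2]

    labels = [None] * len(points)
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # not dense enough: noise, unless a cluster claims it later
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster_id  # former noise point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # dense point: keep growing the cluster
    return labels

# hypothetical data: two tight squares and one far-away point
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

Unlike k-means, the number of clusters is not chosen in advance; it falls out of the two density parameters.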
13. Gaussian Mixture Models:
models the data as a mixture of several Gaussian (bell-curve) distributions.
It is a type of machine learning model that helps describe data made up of several groups.
The model takes in a set of input data points, estimates the mean, spread, and weight of
each Gaussian, and uses them to say how likely a new data point is to belong to each group.
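A Gaussian mixture is usually fitted with the expectation-maximization (EM) procedure: guess which Gaussian each point belongs to, re-estimate the Gaussians from those guesses, and repeat. A minimal sketch, assuming 1-D data and exactly two Gaussians; the starting values and the data are made up.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_1d(data, iterations=50):
    """Fit a two-component 1-D Gaussian mixture with EM."""
    means = [min(data), max(data)]  # crude but deterministic starting guess
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: how responsible is each Gaussian for each point?
        resp = []
        for x in data:
            likes = [w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(likes)
            resp.append([l / total for l in likes])
        # M-step: re-estimate weight, mean, and variance from the responsibilities
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var, 1e-6)  # floor keeps the variance from collapsing to 0
    return weights, means, variances

# hypothetical data: one group near 1, another near 5
data = [0.8, 1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.8, 5.1, 4.9]
weights, means, variances = fit_gmm_1d(data)
print(weights, means, variances)
```

The responsibilities are soft: a point halfway between two Gaussians is shared between them rather than forced into one group.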
14. Autoencoders:
a type of neural network that learns to compress data and then reconstruct it
from the compressed form. The aim is to learn a representation (encoding) [3]
for the data that is smaller than the original data (while still containing the important information).
15. Isolation Forest:
use it to detect outliers in data. It works by building decision trees that split
the data at randomly selected values. If a point is an outlier, it takes fewer splits
to isolate it from the rest of the data.
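The "easier to isolate" idea can be checked by counting how many random splits it takes before a point is alone. A simplified sketch, assuming 1-D data and no subsampling (a real isolation forest also picks a random feature and a random subset of the data for each tree). The function names, the seed, and the data are assumptions.

```python
import random

def isolation_depth(target, values, rng):
    """Number of random splits until `target` is alone in its partition."""
    depth = 0
    current = list(values)
    while len(current) > 1:
        lo, hi = min(current), max(current)
        if lo == hi:
            break  # identical values cannot be separated
        split = rng.uniform(lo, hi)
        # keep only the side of the split that contains the target
        if target < split:
            current = [v for v in current if v < split]
        else:
            current = [v for v in current if v >= split]
        depth += 1
    return depth

def average_depth(target, values, trees=200, seed=0):
    """Average isolation depth over many random 'trees'."""
    rng = random.Random(seed)
    return sum(isolation_depth(target, values, rng) for _ in range(trees)) / trees

# hypothetical data: seven similar values and one obvious outlier
values = [10, 10.5, 11, 11.5, 12, 12.5, 13, 50]
outlier_depth = average_depth(50, values)
inlier_depth = average_depth(11.5, values)
print(outlier_depth, inlier_depth)
```

A low average depth means the point was separated quickly, which is what flags it as an outlier.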
16. One-Class SVM:
like the isolation forest approach, this can be used to find outliers:
it learns a boundary that encloses most of the (normal) data.
Any data point that falls outside this boundary is considered an outlier.
17. Locally Linear Embedding:
a technique used to reduce the dimensionality of data. It does this by describing
each data point as a linear combination of its nearest neighbors and then finding a
lower-dimensional set of points that keeps those neighbor relationships. This way,
you can more easily see the relationships between the data points.
18. t-SNE [1]:
helps to visualize data by reducing the dimensionality of the data. t-SNE works
by measuring how similar the data points are to each other and then placing the
points in a lower dimensional space (usually 2 or 3 dimensions) so that similar points stay close together.
19. Independent Component Analysis (ICA):
used to find hidden sources in data. It does this by looking for components
that are statistically independent of each other. It is a technique for separating out the
different parts of a signal that are mixed, such as individual voices in one recording.
20. Factor Analysis:
used to reduce the amount of data that needs to be analyzed to find patterns.
It does this by identifying groups of variables that behave similarly and explaining
each group with a smaller number of hidden (latent) factors.
Effectively, it is a method used to understand which underlying characteristics
drive the variation in a dataset.
