In [1]:
# COMBINING PANDAS WITH APIs AND STATISTICS
# With all your experience in quantitative analysis, you decide to start your own wealth management firm.
# Your first client is a family member who wants you to evaluate their long-term retirement goals.
# Your initial analysis indicates that their portfolio might generate outsized returns that will enable retirement in five years.
# However, a chance still exists that the market will collapse, preventing early retirement.
# How would you measure the likelihood, or probability, of that uncertain future scenario?
# We can use the Pandas library and data that we get via APIs to make predictions about the future performance of a portfolio.
# However, the fact that no single outcome exists complicates this process.
# Rather, a range of outcomes is possible, and some of them are more likely to occur - that is, they're more probable - than others.
# In this lesson, you'll measure the probability of financial outcomes by combining Pandas functions, APIs, and statistical concepts.
# Applying statistics - like probabilities, probability distributions, and confidence intervals - to portfolio analysis will improve how you manage portfolios and make financial decisions.

In [2]:
# USING PROBABILITY TO MAKE FINANCIAL PROJECTIONS
# PROBABILITY is a concept in statistics that describes the chance that a particular event will happen.
# In finance, we associate probability with the movement of stock prices.
# A stock price can increase or decrease, and the magnitude of that change also varies.
# The change in stock price always remains uncertain.
# However, by analyzing the past movements in a stock price, we can estimate the likely change in that price in both the near-term and the long-term future.
# In this section, we'll explore probability and the related statistical concepts that help us make predictions about the movement in an asset's price.
# First, we'll discuss the probability distribution, which is key to estimating possible future outcomes of stock and portfolio performance.
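# As a minimal sketch of estimating a likely change from past movements, the code below computes daily returns and the share of days on which the price rose.
# The `prices` values are hypothetical and chosen only for illustration; they are not real market data.

```python
import pandas as pd

# Hypothetical closing prices for illustration only (not real market data).
prices = pd.Series(
    [100.0, 102.0, 101.0, 103.0, 102.5, 104.0],
    index=pd.date_range("2024-01-01", periods=6, freq="D"),
    name="close",
)

# Daily return = percentage change from the previous close.
# The first value has no previous close, so it is NaN and gets dropped.
daily_returns = prices.pct_change().dropna()

# Empirical probability of an "up" day: the share of past days with a positive return.
prob_up = (daily_returns > 0).mean()

print(daily_returns.round(4))
print(f"Share of up days: {prob_up:.2f}")
```

# With so few observations this share is only a rough estimate; a longer price history gives a more stable one.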

In [3]:
# WHAT'S A PROBABILITY DISTRIBUTION?
# A PROBABILITY DISTRIBUTION is a mathematical function that allows us to visualize the likelihood of the possible outcomes for a particular event.
# The most common probability distribution is the normal distribution.
# Probability distributions such as the normal distribution help us make educated guesses about what might happen to the performance of a stock, bond, or portfolio in the future, especially the near-term future.
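# One way to see a probability distribution is to group observed outcomes into bins and compute the share of observations in each bin.
# The sketch below does this with Pandas. The returns are simulated with NumPy (an assumption for illustration), not taken from a real stock.

```python
import numpy as np
import pandas as pd

# Simulated daily returns (an assumption for illustration, not real data).
rng = np.random.default_rng(seed=42)
returns = pd.Series(rng.normal(loc=0.0005, scale=0.01, size=1000))

# Group the returns into 10 equal-width bins.
binned = pd.cut(returns, bins=10)

# Share of observations per bin: an empirical probability distribution.
distribution = binned.value_counts(normalize=True, sort=False)

print(distribution.round(3))
# The shares across all bins add up to 1, as probabilities must.
print(f"Total probability: {distribution.sum():.2f}")
```

# Plotting the same data with `returns.plot.hist(bins=10)` would show these shares as a histogram.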

# WHAT'S NORMAL DISTRIBUTION?
# Various real-world datasets, from test scores to the daily returns of a stock over time, approximately follow the normal distribution.
# People commonly refer to NORMAL DISTRIBUTION as the bell curve.
# It describes a dataset where values that lie further from the mean occur less frequently than those that lie closer to the mean.
# In a normal distribution of numerical data, the probability that a data point falls within a given distance of the mean follows the 68-95-99.7 RULE.
# This rule states that approximately 68%, 95%, and 99.7% (effectively 100%) of possible values lie within one, two, and three standard deviations of the mean, respectively.
# The x-axis of a normal distribution plot is commonly marked in standard deviations from the mean.
# Together, the mean and the standard deviation of a dataset define its normal distribution.
# A distance of zero standard deviations corresponds to the mean of the dataset.
# The size of the standard deviation reflects how widely the data spreads out around the mean value.
# Normal distributions prove particularly useful in finance, because they provide a reasonable approximation of how the daily return values for stock prices and other financial assets are distributed.
# For example, the daily return values of a high-volatility stock, such as Tesla, and of a low-volatility stock, such as Coca-Cola, can both approximate a normal distribution.
# This happens despite the differences in company size, customer base, stock price, and market share.
# We can calculate and visualize normal distributions of such stocks by using Pandas.
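# The 68-95-99.7 rule can be checked numerically, as sketched below.
# The returns are drawn from a simulated normal distribution (an assumption for illustration); real daily returns only approximate this behavior.

```python
import numpy as np
import pandas as pd

# Simulated, normally distributed daily returns (illustrative, not real data).
rng = np.random.default_rng(seed=0)
returns = pd.Series(rng.normal(loc=0.0, scale=0.02, size=100_000))

mean = returns.mean()
std = returns.std()

# Share of observations within 1, 2, and 3 standard deviations of the mean.
shares = {}
for k in (1, 2, 3):
    within = (returns - mean).abs() <= k * std
    shares[k] = within.mean()
    print(f"Within {k} standard deviation(s): {shares[k]:.1%}")
```

# The printed shares should land close to 68%, 95%, and 99.7%, with small differences due to sampling.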

In [None]:
# IMPORTANT:
# The mean of a dataset is a measure of central tendency that describes the center of a dataset.
# The standard deviation of a dataset measures how the data spreads out around the mean value.
# While the mean is a measure of central tendency, the standard deviation is a MEASURE OF DISPERSION.
# In Pandas, you can find both of these values by calling the `describe` function on a Series or DataFrame.
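# A short sketch of reading the mean and standard deviation from `describe`.
# The `returns` values are hypothetical and used only for illustration.

```python
import pandas as pd

# Hypothetical daily returns for illustration only.
returns = pd.Series([0.012, -0.02, 0.015, 0.005, -0.01], name="daily_return")

# describe() reports count, mean, std, min, quartiles, and max in one call.
summary = returns.describe()
print(summary)

# Individual statistics can be read from the summary by label.
mean_value = summary["mean"]
std_value = summary["std"]
print(f"Mean: {mean_value:.4f}, standard deviation: {std_value:.4f}")
```

# The `std` value reported here is the sample standard deviation, the same value that `returns.std()` gives by default.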