
Update naive-bayes.md #1656

Merged
merged 13 commits into from Mar 1, 2021

Conversation

particle1331
Contributor

Description of changes:

There's a mistake in the formula for calculating naive Bayes estimates.
$P[x_i, y]$ is not defined. Only $P[i, y]$ is defined, which is the conditional probability $p(x_i = 1 \mid y)$.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

@mli
Member

mli commented Feb 4, 2021

Job d2l-en/PR-1656/1 is complete.
Check the results at http://preview.d2l.ai/d2l-en/PR-1656/

@goldmermaid
Member

goldmermaid commented Feb 4, 2021

Hey @particle1331 , great catch! I like the way you describe the formula, though it does not seem to be a legitimate one... I reconstructed the math and text in the following way:

[image: reconstructed math and text]

If it makes sense to you, please update this PR as is.

(Attached the latex below.)

If we can estimate $\prod_i p(x_i=1  \mid  y)$ for every $i$ and $y$, and save its value in $P_{xy}[i, y]$, here $P_{xy}$ is a $d\times n$ matrix with $n$ being the number of classes and $y\in\{1, \ldots, n\}$, i.e.,


$$ 
p(x_i = t_i \mid y) = 
\begin{cases}
    P_{xy}[i, y] & \text{for } t_i=1 ;\\
    1 - P_{xy}[i, y] & \text{for } t_i = 0 .
\end{cases}
$$

In addition, we estimate $p(y)$ for every $y$ and save it in $P_y[y]$, with $P_y$ an $n$-length vector. Then for any new example $\mathbf x$, we could compute

$$ 
\begin{aligned}
\hat{y} 
&= \mathrm{argmax}_y \ \prod_{i=1}^d P_y[y] \times P_{xy}[i, y]\\
&= \mathrm{argmax}_y \ \prod_{i=1}^d P_y[y]\ P_{xy}[i, y]^{t_i}\, \left(1 - P_{xy}[i, y]\right)^{1-t_i},
\end{aligned}
$$
:eqlabel:`eq_naive_bayes_estimation`


for any $y$. So our assumption of conditional independence has taken the complexity of our model from an exponential dependence on the number of features $\mathcal{O}(2^dn)$ to a linear dependence, which is $\mathcal{O}(dn)$.
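The prediction rule discussed above can be sketched in code. This is a hypothetical illustration, not part of the PR: `naive_bayes_predict`, `P_xy`, and `P_y` are assumed names, with `P_xy` a $d \times n$ NumPy array of estimates $p(x_i = 1 \mid y)$ and `P_y` the class priors; the product is computed in log space so large $d$ does not underflow.

```python
import numpy as np

def naive_bayes_predict(t, P_xy, P_y):
    """Predict the class of a binary feature vector t.

    P_xy[i, y] holds the estimate of p(x_i = 1 | y), shape (d, n).
    P_y[y] holds the estimate of p(y), length n.
    Works in log space to avoid numerical underflow for large d.
    """
    t = np.asarray(t).reshape(-1, 1)             # (d, 1) for broadcasting over classes
    # log p(x_i = t_i | y) = t_i log P_xy[i, y] + (1 - t_i) log(1 - P_xy[i, y])
    log_like = t * np.log(P_xy) + (1 - t) * np.log(1 - P_xy)
    scores = np.log(P_y) + log_like.sum(axis=0)  # one score per class y
    return int(np.argmax(scores))

# Toy example: d = 2 features, n = 2 classes.
P_xy = np.array([[0.9, 0.2],
                 [0.8, 0.3]])
P_y = np.array([0.5, 0.5])
naive_bayes_predict([1, 1], P_xy, P_y)  # returns 0: class 0 fits t = (1, 1) best
```

Note that evaluating the $d \times n$ table once per example is exactly the $\mathcal{O}(dn)$ cost mentioned above.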

@mli
Member

mli commented Feb 4, 2021

Job d2l-en/PR-1656/2 is complete.
Check the results at http://preview.d2l.ai/d2l-en/PR-1656/

@particle1331
Contributor Author

particle1331 commented Feb 5, 2021

@goldmermaid This looks awesome! Will do, thanks.

EDIT: I looked more closely; shouldn't it be what is shown here:

[screenshot: Screen Shot 2021-02-05 at 10 17 57 AM]

Here I changed

If we can estimate $\prod_i p(x_i=1 \mid y)$ for every $i$ and $y$, and save its value in $P_{xy}[i, y]$, here $P_{xy}$ is a $d\times n$ matrix with $n$ being the number of classes and $y\in\{1, \ldots, n\}$, i.e.,

to

If we can estimate $p(x_i=1 \mid y)$ for every $i$ and $y$, and save its value in $P_{xy}[i, y]$, here $P_{xy}$ is a $d\times n$ matrix with $n$ being the number of classes and $y\in\{1, \ldots, n\}$, i.e.,

along with introducing $\mathbf{t} = (t_1, \ldots, t_d)$ for the test vector.
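The role of $\mathbf t$ can be made concrete with a small sketch (hypothetical, not part of the PR): each factor $P_{xy}[i, y]^{t_i}\,(1 - P_{xy}[i, y])^{1-t_i}$ simply selects $P_{xy}[i, y]$ where the test bit $t_i = 1$ and $1 - P_{xy}[i, y]$ where $t_i = 0$.

```python
import numpy as np

# Hypothetical estimates p(x_i = 1 | y) for one fixed class y.
p = np.array([0.9, 0.2, 0.7])   # one entry per feature i
t = np.array([1, 0, 1])         # test vector t = (t_1, ..., t_d)

# p**t * (1 - p)**(1 - t) picks p[i] where t[i] == 1, else 1 - p[i].
factors = p**t * (1 - p)**(1 - t)
likelihood = factors.prod()     # p(x = t | y) under conditional independence
print(factors)                  # [0.9 0.8 0.7]
print(likelihood)               # 0.9 * 0.8 * 0.7 = 0.504
```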

@mli
Member

mli commented Feb 5, 2021

Job d2l-en/PR-1656/3 is complete.
Check the results at http://preview.d2l.ai/d2l-en/PR-1656/

@goldmermaid
Member

LGTM! Thanks for your contribution @particle1331 !

@particle1331
Contributor Author

particle1331 commented Feb 5, 2021

@goldmermaid Hi. I'm sorry I wasn't able to push the changes outlined in the comment last night -- I wanted to know first if it works for you. Please check the recent changes if you have time. Thanks!

EDIT: One problem that I can't solve is the :eqlabel:`eq_naive_bayes_estimation` at the end:

[screenshot: Screen Shot 2021-02-05 at 8 04 01 PM]

@mli
Member

mli commented Feb 5, 2021

Job d2l-en/PR-1656/4 is complete.
Check the results at http://preview.d2l.ai/d2l-en/PR-1656/

@mli
Member

mli commented Feb 5, 2021

Job d2l-en/PR-1656/5 is complete.
Check the results at http://preview.d2l.ai/d2l-en/PR-1656/

@mli
Member

mli commented Feb 5, 2021

Job d2l-en/PR-1656/6 is complete.
Check the results at http://preview.d2l.ai/d2l-en/PR-1656/

@mli
Member

mli commented Feb 5, 2021

Job d2l-en/PR-1656/7 is complete.
Check the results at http://preview.d2l.ai/d2l-en/PR-1656/

$$
\begin{aligned}
\hat{y}
&= \mathrm{argmax}_y \ p(y)\prod_{i=1}^d p(x_i = t_i \mid y)\\
&= \mathrm{argmax}_y \ P_y[y]\prod_{i=1}^d \ P_{xy}[i, y]^{t_i}\, \left(1 - P_{xy}[i, y]\right)^{1-t_i}
\end{aligned}$$
Member


Please separate the "$$" onto a new line, or the :eqlabel:`eq_naive_bayes_estimation` won't render properly.

Member


i.e.,

$$
\begin{aligned}

...

\end{aligned}
$$
:eqlabel:`eq_naive_bayes_estimation`

@mli
Member

mli commented Feb 7, 2021

Job d2l-en/PR-1656/9 is complete.
Check the results at http://preview.d2l.ai/d2l-en/PR-1656/

@mli
Member

mli commented Feb 7, 2021

Job d2l-en/PR-1656/10 is complete.
Check the results at http://preview.d2l.ai/d2l-en/PR-1656/

@astonzhang
Member

@goldmermaid does this PR look good to you now?

Comment on lines 184 to 190
$$
\begin{aligned}
\hat{y}
&= \mathrm{argmax}_y \ \prod_{i=1}^d P_y[y] \times P_{xy}[i, y]\\
&= \mathrm{argmax}_y \ \prod_{i=1}^d P_y[y]\ P_{xy}[i, y]^{t_i}\, \left(1 - P_{xy}[i, y]\right)^{1-t_i},
\end{aligned}
$$
Member


Hey @particle1331 , apologies for the slight delay. Here is the reason why this "eqlabel" does not render well:

"$$\begin{aligned} ... \end{aligned}$$" needs to be on one single line.

Please check (search for `\begin{aligned}`) this example: https://github.com/d2l-ai/d2l-en/blob/master/chapter_linear-networks/linear-regression.md.

Let me know if you are still having trouble solving it.

Contributor Author


Hi! Sorry for the delay as well. Thank you for pointing me towards this solution.
It worked. :)

http://preview.d2l.ai/d2l-en/PR-1656/chapter_appendix-mathematics-for-deep-learning/naive-bayes.html

@mli
Member

mli commented Mar 1, 2021

Job d2l-en/PR-1656/11 is complete.
Check the results at http://preview.d2l.ai/d2l-en/PR-1656/

@mli
Member

mli commented Mar 1, 2021

Job d2l-en/PR-1656/12 is complete.
Check the results at http://preview.d2l.ai/d2l-en/PR-1656/

@mli
Member

mli commented Mar 1, 2021

Job d2l-en/PR-1656/13 is complete.
Check the results at http://preview.d2l.ai/d2l-en/PR-1656/

@goldmermaid
Member

LGTM! Fantastic work @particle1331 !

@astonzhang astonzhang merged commit 81f58f2 into d2l-ai:master Mar 1, 2021
@particle1331 particle1331 deleted the patch-4 branch March 1, 2021 22:35