http://ailearning.apachecn.org/ml/4.NaiveBayesian/
ApacheCN: an open-source organization dedicated to maintaining excellent projects
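Let p1(x,y) denote the probability that data point (x,y) belongs to class 1 (the class drawn as circles in the figure), and p2(x,y) the probability that it belongs to class 2 (the class drawn as triangles). For a new data point (x,y), its class can then be decided with the following rule:
If p1(x,y) > p2(x,y), the class is 1. If p2(x,y) > p1(x,y), the class is 2.
Is this written incorrectly? Shouldn't it be:
If newPoint(x,y) > p2(x,y), the class is 1. If newPoint(x,y) > p1(x,y), the class is 2.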
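No, it is not a mistake. We compute the probability of each point under both classes, and the point is assigned to whichever class gives the larger probability.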
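To make the rule concrete, here is a minimal sketch. The function name `classify_point` and the example probabilities are hypothetical, and sending ties to class 2 is an assumption, since the quoted rule does not say what happens when the two probabilities are equal.

```python
def classify_point(p1: float, p2: float) -> int:
    """Pick the class with the larger probability for one data point.

    p1 is the probability that the point belongs to class 1,
    p2 the probability that it belongs to class 2.
    Ties go to class 2 here (an assumption, not part of the quoted rule).
    """
    return 1 if p1 > p2 else 2


# Hypothetical probabilities for two new points
print(classify_point(0.7, 0.3))  # class 1, because p1 > p2
print(classify_point(0.2, 0.8))  # class 2, because p2 > p1
```

The new point itself is never compared against a probability. It is only the input from which p1 and p2 are computed, and the two results are then compared with each other.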
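At the end of the spamTest() function in the first example, the set-of-words model is used to build the word vectors:

    for docIndex in testSet:
        wordVector = setOfWords2Vec(vocabList, docList[docIndex])

But when the probabilities are computed, isn't the denominator the one from the bag-of-words model? It adds up the number of occurrences of all words:

    for i in range(numTrainDocs):
        if trainCategory[i] == 1:
            # Accumulate the per-word frequencies of abusive documents
            p1Num += trainMatrix[i]
            # Add up the total word count of each abusive document
            p1Denom += sum(trainMatrix[i])
        else:
            p0Num += trainMatrix[i]
            p0Denom += sum(trainMatrix[i])

With the set-of-words model, shouldn't the denominator be p1Denom += 1 and p0Denom += 1?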
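To see what the two denominators actually change, here is a small pure-Python sketch. The function `estimate_word_probs`, its `per_document` flag, and the toy data are hypothetical and are not taken from the project's code. Smoothing is deliberately left out, so the function assumes each class has at least one non-empty document.

```python
def estimate_word_probs(train_matrix, train_category, per_document=False):
    """Estimate per-word probabilities for class 0 and class 1.

    per_document=False: the denominator grows by sum(vec), as in the
                        snippet quoted above.
    per_document=True:  the denominator grows by 1 per document, as the
                        question proposes for the set-of-words model.
    No smoothing is applied (an assumption made to keep the sketch short).
    """
    num_words = len(train_matrix[0])
    num = {0: [0.0] * num_words, 1: [0.0] * num_words}
    denom = {0: 0.0, 1: 0.0}
    for vec, label in zip(train_matrix, train_category):
        for j, value in enumerate(vec):
            num[label][j] += value
        denom[label] += 1.0 if per_document else sum(vec)
    p0 = [n / denom[0] for n in num[0]]
    p1 = [n / denom[1] for n in num[1]]
    return p0, p1


# Toy set-of-words vectors over a 3-word vocabulary (hypothetical data)
train_matrix = [[1, 1, 0], [1, 0, 0], [0, 1, 1]]
train_category = [1, 1, 0]

p0_sum, p1_sum = estimate_word_probs(train_matrix, train_category)
p0_doc, p1_doc = estimate_word_probs(train_matrix, train_category,
                                     per_document=True)
print(p0_sum, p1_sum)
print(p0_doc, p1_doc)  # p1_doc is [1.0, 0.5, 0.0]
```

With `per_document=True` each value is the fraction of documents in the class that contain the word. With the summed denominator each value is the word's share of all word occurrences in the class. Both versions rank the words within one class the same way, but the scale differs between classes, so the choice can affect the final comparison.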