Mine the proposal data of each presidential candidate #4
To create the candidates' dataset, data about each candidate must first be mined and then made available. This algorithm was developed only to supply the candidates' data to another piece of code, which then builds the dataset. The algorithm is not finished yet, so it may still change. See also: #4, #3
Added new code to create a dataset for content-based filtering. The TF-IDF metric was implemented to improve the candidates' results. See also: #4
Final status: Two algorithms were developed from mining the candidates' proposals. The first returns the candidate's dictionary based on the term counts for the areas mentioned, and the second returns the candidate's dictionary with the TF/IDF metric. A recommendation algorithm will be built that will receive the data from …
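For illustration, the second dictionary could be derived from the first roughly as follows. This is only a sketch, not the code from the repository: the function name `tf_idf`, the input shape (one term-count dictionary per candidate), and the unsmoothed `log(N / df)` form of IDF are all assumptions, since the issue does not say which TF-IDF variant was used.

```python
import math


def tf_idf(counts_by_candidate):
    """Turn per-candidate term counts into TF-IDF scores.

    counts_by_candidate: {candidate: {term: count}} (hypothetical input shape).
    TF is count / total count for that candidate; IDF is log(N / df),
    where df is the number of candidates that mention the term at least once.
    """
    n = len(counts_by_candidate)

    # Document frequency: how many candidates mention each term.
    df = {}
    for counts in counts_by_candidate.values():
        for term, count in counts.items():
            if count > 0:
                df[term] = df.get(term, 0) + 1

    scores = {}
    for candidate, counts in counts_by_candidate.items():
        total = sum(counts.values())
        scores[candidate] = {
            # Terms the candidate never mentions score 0.0.
            term: (count / total) * math.log(n / df[term])
            if total and count > 0
            else 0.0
            for term, count in counts.items()
        }
    return scores
```

With this variant, an area that every candidate mentions gets a score of zero, so the scores emphasize the areas that set a candidate apart, which is the property a content-based recommender would rely on.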
Each candidate's proposals are available in a web article on the Globo website. The data from each candidate's PDF should be mined, and the number of mentions of each area (economy, health, technology, etc.) counted.
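The counting step could look something like the sketch below. It assumes the PDF text has already been extracted into a plain string, and the keyword lists in `AREA_KEYWORDS` are hypothetical: the issue names the areas but not the terms used to detect them.

```python
import re
from collections import Counter

# Hypothetical keyword lists per area (in Portuguese, like the proposals).
AREA_KEYWORDS = {
    "economia": ["economia", "imposto"],
    "saúde": ["saúde", "hospital", "sus"],
    "tecnologia": ["tecnologia", "inovação", "digital"],
}


def count_area_mentions(text, area_keywords=AREA_KEYWORDS):
    """Return {area: number of keyword occurrences} for one candidate's text."""
    # Lowercase and split into word tokens; \w matches accented letters in Python 3.
    tokens = Counter(re.findall(r"\w+", text.lower()))
    return {
        area: sum(tokens[keyword] for keyword in keywords)
        for area, keywords in area_keywords.items()
    }
```

Exact-token matching keeps the sketch simple but misses inflected forms (for example plurals), so a real implementation would probably need stemming or broader keyword lists.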