davistanugraha/tf-Financial-Sentiment-Analyzer

Financial-Sentiment-Analyzer

Purpose:

This project aims to help quantify qualitative financial data. We are building a sentence classifier that assigns sentences from a financial context to a sentiment class (positive, negative, or neutral). Many sentiment analyzers are publicly available, but they are trained on general-purpose sentences and often misclassify domain-specific ones. For example, "liability" is not necessarily bad in a financial context, where liabilities serve as ordinary indicators. We are building a tool that considers both the semantic and the subject-matter properties of a sentence: "financial tagging" captures the financial context of the sentence, while the "vector embedding" captures its semantic properties.

Use case:

When building this project, the ideal use case we had in mind was quantitative traders who need to quickly process large volumes of qualitative market signals and convert them into numerical data that can be fed into their existing quant models. This lets them respond quickly to market signals, giving them an edge.

2 methods:

1. Financial tagging + sentence2vec embedding + classical downstream classifiers

2. Financial tagging + word2vec embedding + convolutional neural network classifier

Using this repo:

There are 3 main parts (each notebook represents one part):

1. Tagging sentences with financial lexicons + training word2vec and sentence2vec embeddings:
      1.0 Parse real financial earnings calls using a PDF parser (built in-house; it can be found under ml_models. P.S.: yes, I know it is not an ML model and I should have created a file called services; file management will be improved later).
      1.1 Break each call down into sentences and replace every word that belongs to a lexicon with that lexicon's title (all the lexicons can be found under resources/tagging_lexicons; the function that does this is in ml_models/apriori.py, within the Apriori class).
      1.2 Feed the tagged sentences to a word2vec model for unsupervised training. We also built a sent2vec class (can be found in ml_models)
      that has preprocessing functions to "clean" sentences, train word2vec models, and form sentence vectors by averaging the word vectors.
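
The tagging idea in step 1.1 can be sketched roughly as follows. This is a minimal illustration, not the code in ml_models/apriori.py: the lexicon names and contents, and the helper names `split_sentences` and `tag_sentence`, are hypothetical and chosen only for the example.

```python
import re

# Hypothetical lexicons for illustration only; the real ones live under
# resources/tagging_lexicons and may look quite different.
LEXICONS = {
    "LIABILITY": {"debt", "liability", "liabilities"},
    "PROFIT": {"profit", "earnings"},
}

# Reverse lookup: each lowercase lexicon word maps to its lexicon title.
WORD_TO_TAG = {word: title for title, words in LEXICONS.items() for word in words}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tag_sentence(sentence):
    """Lowercase and tokenize a sentence, replacing lexicon words with their title."""
    tokens = re.findall(r"[a-z0-9']+", sentence.lower())
    return [WORD_TO_TAG.get(tok, tok) for tok in tokens]


call = "Earnings grew this quarter. Total debt was reduced!"
tagged = [tag_sentence(s) for s in split_sentences(call)]
print(tagged)
```

Token lists like these are the kind of input a word2vec trainer typically expects, so the lexicon titles end up with their own embeddings.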
      
2. sentence2vec + classical downstream classifiers
    2.1 Convert sentences to vectors using the model trained in part 1.
    2.2 Feed the vectors to classical ML models.
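
As a rough sketch of step 2.1, a sentence vector can be formed by averaging the vectors of the words it contains. The function name `sentence_vector` and the toy 2-dimensional embeddings below are assumptions made for illustration; they stand in for the trained word2vec model and are not taken from the sent2vec class.

```python
def sentence_vector(tokens, word_vectors, dim):
    """Average the vectors of known tokens; return a zero vector if none are known."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


# Toy embeddings (assumed values) standing in for a trained word2vec model.
toy_vectors = {
    "PROFIT": [1.0, 0.0],
    "grew": [0.0, 1.0],
    "LIABILITY": [-1.0, 0.0],
}

# "strongly" is out of vocabulary here, so it is skipped in the average.
vec = sentence_vector(["PROFIT", "grew", "strongly"], toy_vectors, dim=2)
print(vec)
```

Every sentence then maps to a vector of the same length, which is what lets the result be passed to any classical classifier that takes fixed-length feature vectors.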

3. word2vec + CNN
    3.1 Convert each sentence to a 2D tensor (the tensor dimensions are fixed, so longer sentences are cut off) using the model and functions built and trained in the sent2vec class in part 1.
    3.2 Construct a one-layer neural network with convolution filters of 5 different sizes, one dropout layer, and one fully connected layer.
    A clear illustration of the network architecture can be found here: https://www.researchgate.net/figure/Illustration-of-our-CNN-model-for-sentiment-analysis-Given-a-sequence-of-d-dimension_fig3_321259272
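
The two ideas in part 3 can be sketched in plain Python: building a fixed-size sentence matrix, and applying one convolution filter followed by max-over-time pooling. This is a simplified illustration under stated assumptions, not the repository's network. The helper names, the toy embeddings, and the kernel values are hypothetical, zero-padding of short sentences is an assumption, and bias terms and nonlinearities are left out.

```python
def to_fixed_matrix(tokens, word_vectors, max_len, dim):
    """Build a max_len x dim matrix: truncate long sentences, zero-pad short ones."""
    zero = [0.0] * dim
    # Unknown words are mapped to the zero vector (an assumption for this sketch).
    rows = [list(word_vectors.get(t, zero)) for t in tokens[:max_len]]
    rows += [list(zero) for _ in range(max_len - len(rows))]
    return rows


def conv_max_pool(matrix, kernel):
    """Slide a (width x dim) kernel down the rows and keep the largest response."""
    width = len(kernel)
    responses = []
    for start in range(len(matrix) - width + 1):
        window = matrix[start:start + width]
        responses.append(
            sum(w * x
                for k_row, m_row in zip(kernel, window)
                for w, x in zip(k_row, m_row))
        )
    return max(responses)


# Toy embeddings and kernel (assumed values, for illustration only).
toy_vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}

matrix = to_fixed_matrix(["a", "b", "c", "a", "b"], toy_vectors, max_len=4, dim=2)
kernel = [[1.0, 0.0], [0.0, 1.0]]  # one filter spanning 2 consecutive words
feature = conv_max_pool(matrix, kernel)
print(matrix, feature)
```

In a network like the one described in 3.2, one would expect the pooled features from filters of each of the 5 sizes to be concatenated before the dropout and fully connected layers. That is a reading of the description above, not a statement about the actual notebook code.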

Side note: some of the classes and functions that play primary roles were built in-house and do not come from open-source libraries. You can find all of these functions and classes, along with their documentation, within ml_models.

About

Classifying financial sentences by building a model that considers both financial and semantic context.
