Automated machine learning company report in an interactive 'PDF style' from four dimensions: employees, customers, shareholders (owners) and management.

FirmAI Report

For a sample version of the report (web app), see FirmAI Report.

This report endeavours to provide ratings of four corporate dimensions: employees, customers, shareholders and management, as benchmarked against competitors. It also shows the change in ratings over time. In a final step, a machine learning model compares all the metrics (about 80) with company valuations to establish whether a firm is under- or overvalued. Most notably, it predicted that BJ's Restaurants was significantly undervalued at the end of 2017; within six months the stock price doubled. In the chart, which shows the performance of a $100 portfolio (not the stock price) over five years, the light blue line is the ML valuation and the dark blue line is the real market value.
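As a rough illustration of the valuation step described above, the sketch below fits a line between a single composite score and market value across peers, then reports how far a firm's market value sits from the model-implied value. This is not the report's actual model, which uses about 80 metrics. The function names, the single-predictor simplification and the toy numbers are all assumptions made for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def valuation_gap(score, market_value, intercept, slope):
    """Relative gap between model-implied value and market value.

    A positive gap means the model-implied value exceeds the market value,
    which this sketch reads as a sign of possible undervaluation.
    """
    implied = intercept + slope * score
    return (implied - market_value) / market_value


# Toy, made-up peer data (hypothetical): composite score vs. market value.
peer_scores = [1.0, 2.0, 3.0, 4.0]
peer_values = [10.0, 20.0, 30.0, 40.0]

intercept, slope = fit_line(peer_scores, peer_values)

# A hypothetical firm with score 3.0 that trades at 24.0.
gap = valuation_gap(3.0, 24.0, intercept, slope)
print(f"valuation gap: {gap:+.0%}")
```

A real cross-sectional model would need many more firms than predictors, or some form of regularisation, before the gap could be read as a signal.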

This report consists of Programmatic Competitor Analysis, NLP Sentiment Analysis, NLP Summarisation, ML Time Series and Cross-Section Prediction (Valuation, Closures, Geographic Opportunity), Employee Growth and Qualifications Measures, Location Ratings, Rating Growth, Social Media Analytics, Compensation Satisfaction Analysis, Interview Analysis, Product Analysis and Financial PCA. It is my hope that this report, together with the analysis, generated data and scraping scripts (in the functionality folder), will benefit smaller firms that do not necessarily have access to this technology stack.

Overview

Description

The report is built out of a Dash example. It is fully automated and updates on a monthly basis. It allows companies to study multiple competitors and company locations without strenuous user input. It is the first interactive report of its kind. It is in PDF style, making it easily digestible and also easy to print for meetings.

All information is extracted from the public domain using modern programming tools. This report uses state-of-the-art machine learning and natural language processing techniques for deep sentiment analysis and prediction tasks. The report analyses a company from four dimensions: employees, customers, shareholders (owners) and management. Information is gathered from numerous online sources, the majority of which do not sit behind paywalls. This report serves the following functions:

  • Identify the overall sentiment of your firm on the aforementioned dimensions.
  • Identify the extent to which your firm is currently under- or overvalued as per qualitative and quantitative metrics using machine learning.
  • Compare the valuation of your firm against that of close competitors, and programmatically identify close competitors.
  • Get an overview as to which locations are the most and least at risk of closing using inbuilt machine learning tools.
  • Get to understand the different attributes leading to higher customer satisfaction.
  • Get an indication as to how well the company has done by following various metrics over time.
  • Gain a deeper insight into how your employee and management cohort compares against industry benchmarks.
  • Isolate competitor firms using five different algorithmic benchmarks.
  • Identify the relationship between firm value and three machine learning satisfaction ratings (employee, customer and manager satisfaction).
  • Identify the top employment regions historically and more recently by analysing open job locations.
  • Look at different positive and negative sentiment summaries from employees and customers as identified with natural language processing tools.
  • Get to know the composition of employees such as their level of qualifications, skill and their hierarchical position across different benchmarks.
  • Identify the level of employee growth among competitors.
  • Understand employees' level of satisfaction with their compensation packages.
  • Survey the surroundings to understand the geographic competitiveness.
  • Explore the difference in ratings across states and counties.
  • Get an understanding of the sentiment as it relates to different categories.
  • Identify some of the key financial metrics and patterns leading to company success.
  • Compare competitors' website and social media stats.
  • Get an understanding of each firm's online footprint and how it changes over time.
  • Get an overall rating of the firm at present and historically to gauge possible future rating changes.
  • Gain a better understanding of customers both locally and nationally.
  • Obtain a better understanding of the interview process and other details.
  • Identify competitors' top products and categorical prices.
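
One simple way to identify close competitors programmatically is to rank firms by distance between their metric vectors. The sketch below is only an assumed approach for illustration; the README does not say which five algorithmic benchmarks the report uses. The firm names, the two-value metric vectors and the `closest_competitors` helper are hypothetical.

```python
from math import sqrt


def closest_competitors(target, firms, k=2):
    """Return the names of the k firms whose metric vectors are nearest to target's."""

    def distance(a, b):
        # Euclidean distance between two equal-length metric vectors.
        return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    others = [(name, vec) for name, vec in firms.items() if name != target]
    ranked = sorted(others, key=lambda item: distance(firms[target], item[1]))
    return [name for name, _ in ranked[:k]]


# Toy, made-up metric vectors (hypothetical), e.g. two standardised ratings per firm.
firms = {
    "firm_a": [0.0, 0.0],
    "firm_b": [1.0, 0.0],
    "firm_c": [0.0, 3.0],
    "firm_d": [5.0, 5.0],
}

nearest = closest_competitors("firm_a", firms, k=2)
print(nearest)
```

In practice the metrics would need to be standardised first so that no single metric dominates the distance.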

Report

Development

The report will grow dynamically over time and eventually become more prescriptive in nature.

  • In the future the report would attempt to predict prospective revenue and identify the portion of revenue generated from each location.
  • Furthermore, different levels of overall firm financial health would be estimated using machine learning techniques.
  • A further procedure would include the analysis of firm financial filings and financial statement readability along with anomaly detection.
  • A further 30 novel databases are to be compiled to estimate the level of corporate social responsibility of each firm.
  • Finally, an improved valuation model would be created for firms that are not publicly traded, and causal analysis would be added.
  • Any additional forms of analysis as requested by the client. A more granular exploration would likely require internal data.

Running Your Own

  • Download Repository
  • Run scrapers with setup.py (only if you want to generate new data)
  • Install dependencies in requirements.txt
  • Run main.py
  • Note: this repository is large (4GB) and already contains data.
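
The steps above can be sketched as a small helper that builds the commands for a local run. This is a sketch under stated assumptions: it assumes the scrapers are started by running `setup.py` directly with the Python interpreter, and it installs the dependencies first on the assumption that the scrapers need them. The `build_commands` and `run_all` names are hypothetical and not part of the repository.

```python
import subprocess
import sys


def build_commands(regenerate_data=False):
    """Return the commands for a local run as argv lists, in execution order."""
    python = sys.executable
    commands = [[python, "-m", "pip", "install", "-r", "requirements.txt"]]
    if regenerate_data:
        # Assumption: the scrapers are started by running setup.py directly.
        commands.append([python, "setup.py"])
    commands.append([python, "main.py"])
    return commands


def run_all(regenerate_data=False):
    """Run each command from the repository root, stopping at the first failure."""
    for argv in build_commands(regenerate_data):
        subprocess.run(argv, check=True)


# Print the planned commands without executing them.
for argv in build_commands(regenerate_data=False):
    print(" ".join(argv))
```

Calling `run_all()` from the root of the downloaded repository would execute the commands. Pass `regenerate_data=True` only if new data should be scraped, since the repository already ships with data.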