- Topics: course overview, Git Bash, Python config.ini files, conda virtual environments
- Technology: Git Bash, configparser, conda
- Homework: use the command line to search for data across thousands of server configuration files
- Topics: automate the process to collect data from https://www.annualreports.com
- Technology: requests, Jupyter Notebooks, BeautifulSoup, Scrapy
- Homework: automate the process to identify and download company 10-K annual reports
- Topics: use sqlalchemy to create and populate a database, locally and on AWS
- Technology: sqlalchemy, SQLite, AWS RDS (MySQL)
- Homework: create and populate a database with sqlalchemy
- Topics: use docx to extract text from Microsoft Word documents; discuss the PyCharm debugger
- Technology: docx, pdfminer.six, subprocess, PyCharm
- Homework: structure the annual reports into sections
- Topics: refactor the automation homework, use task scheduler to automate the script locally, discuss AWS technologies to automate the script in the cloud
- Technology: Python, `__init__.py`, Spyder, AWS S3, AWS Lambda, AWS DynamoDB, AWS CloudWatch
- Topics: lemmatization, POS tagging, dependency parsing, rule-based matching
- Technology: spaCy
- Topics: acronyms, POS phrases, phrase detection
- Technology: spaCy, gensim
- Topics: vector space model, TF-IDF, BM25, co-occurrence matrix
- Technology: scikit-learn
- Homework: clean text from annual reports
- Topics: reconstruct scikit-learn's CountVectorizer codebase
- Technology: scikit-learn, object-oriented Python
- Topics: PCA, latent semantic indexing (LSI), latent Dirichlet allocation (LDA), topic coherence metrics, and Word2Vec
- Technology: scikit-learn, gensim
- Homework: read TamingTextwiththeSVD (ftp://ftp.sas.com/techsup/download/EMiner/) and create topic models for annual report sections
- Topics: cosine similarity, distance metrics, L1 and L2 norms, recommendation engines
- Technology: scikit-learn, spaCy, gensim
- Topics: TBD
- Technology: scikit-learn
- Topics: capture, format, and send logging messages to a variety of outputs; exception handling; create an executable from a Python package for deployment
- Technology: scikit-learn, logging, Python exceptions, pyinstaller, argparse
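
As a rough illustration of how the logging, exception handling, and argparse topics in the last entry fit together, here is a minimal sketch of a command-line entry point. The task (counting lines in a file) and the names `build_parser`, `count_lines`, `main`, and `line_counter` are hypothetical and chosen only for illustration; they are not taken from the course materials.

```python
import argparse
import logging


def build_parser():
    """Define the command-line interface (hypothetical example tool)."""
    parser = argparse.ArgumentParser(description="Count the lines in a text file.")
    parser.add_argument("path", help="text file to read")
    parser.add_argument(
        "--log-level",
        default="INFO",
        choices=["DEBUG", "INFO", "WARNING", "ERROR"],
        help="minimum severity of messages to emit",
    )
    return parser


def count_lines(path):
    """Return the number of lines in the file at `path`."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for _ in handle)


def main(argv=None):
    """Parse arguments, configure logging, and return a process exit code."""
    args = build_parser().parse_args(argv)
    # Send formatted messages to stderr; a FileHandler could be added for file output.
    logging.basicConfig(
        level=getattr(logging, args.log_level),
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    logger = logging.getLogger("line_counter")
    try:
        total = count_lines(args.path)
    except OSError as exc:
        # Missing or unreadable files are reported through the log, not a traceback.
        logger.error("could not read %s: %s", args.path, exc)
        return 1
    logger.info("%s has %d lines", args.path, total)
    return 0
```

Calling `main(["report.txt", "--log-level", "DEBUG"])` returns 0 on success and 1 if the file cannot be read. A script would typically pass that value to `sys.exit`. An entry point of this shape is the kind of script a tool such as pyinstaller could then package into an executable.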