<h1 style="display:inline;">Healthcare Deep Learning with TensorFlow</h1>
<br>
<h2 style="display:inline;">Spiro Ganas, MS</h2>


<a href="https://www.kaggle.com/spiroganas/healthcare-deep-learning-table-of-contents">Table of Contents</a> 
<br>
## Chapter 1 - Introduction

<h3>What Can Deep Learning Do?</h3>

[Deep Learning](https://en.wikipedia.org/wiki/Deep_learning) algorithms have achieved human-level success on tasks such as:


*   Image Recognition
*   Speech Recognition
*   Natural Language Processing
*   Playing games such as Pong, Chess and Go
*   Teaching robots to walk and cars to drive themselves


Personally, I feel the most exciting applications of deep learning are in the medical field.  Recent successes include:

*  Detecting [lung cancer in CT scan images](https://www.kaggle.com/c/data-science-bowl-2017)
*  Detecting [diabetic retinopathy in retinal fundus photographs](https://jamanetwork.com/journals/jama/fullarticle/2588763)
*  Measuring [cardiac function from MRI images](https://www.kaggle.com/c/second-annual-data-science-bowl)
*  Detecting [disease in chest x-rays](https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2728630)
*   Determining [disease severity of patients with ulcerative colitis](https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2733432)





<h3>What is the goal of this training?</h3>

I work for a research organization where most people are:


*   Medical Doctors
*   Public Health Researchers
*   Biostatisticians
*   Computer Scientists

The goal of this training is to turn these folks into "Healthcare Data Scientists".  



## What is a Healthcare Data Scientist?

Data Science is often depicted as a [Venn diagram](http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram) consisting of three parts:


*   Subject Matter Expertise - in this case, you should know enough about medicine and medical data to develop interesting research questions and an appropriate approach to answering them.
*   Statistics Skills - [Evidence-based medicine](https://en.wikipedia.org/wiki/Evidence-based_medicine) is the idea that medical decisions should be based on evidence collected from well-designed research.  Statistics is the branch of mathematics that deals with the design of experiments, the collection of data, and the analysis of that data.  Statistics will ultimately determine how your data answers the research question.
*   Computer Science Skills - Data can be "dirty".  Or you can have too much data to fit on a single computer (i.e. "Big Data").  Or the math involved can be so complex that you need to run it on a supercomputer, or on a GPU, or on a cluster of regular computers. The ability to program and other computer science skills distinguish data science from "traditional" research.

The ultimate goal of all healthcare data scientists is to improve patient care by analyzing medical data and generating evidence that informs clinical practice.




<h3>Why Now?</h3>

Deep Learning is not a new technology, but it has only recently become accessible to most researchers.  There are three key reasons deep learning has become increasingly popular:

1.   Training a large deep learning model involves many, many linear algebra calculations (too many for even the fastest CPUs to handle in a reasonable amount of time).  In 2007, NVIDIA released [CUDA](https://en.wikipedia.org/wiki/CUDA), a parallel computing platform that lets you run all this math on a GPU card.  As of July 2019, NVIDIA advertises a speed of [112 teraflops on deep learning problems](https://www.nvidia.com/en-us/data-center/tesla-v100/) for its V100 GPU, offering "the performance of up to 100 CPUs in a single GPU."
2.   TensorFlow and Keras, two Python libraries that make it easier to design and train neural networks, were first released in 2015.
3.   Cloud computing providers, including Google, Microsoft and Amazon, have made it possible to rent virtual machines that contain GPU hardware and pre-installed deep learning software.  So any researcher can now access state-of-the-art deep learning hardware and software.
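
To get a feel for the numbers in the first point, here is a rough back-of-the-envelope sketch.  The layer and batch sizes are made-up values chosen only for illustration, and the calculation assumes the GPU actually sustains the advertised 112 teraflops quoted above, which real workloads rarely do.

```python
# Rough estimate of the arithmetic in one dense (fully connected) layer.
# All three sizes below are hypothetical, chosen only for illustration.
batch_size = 256   # examples processed at once
inputs = 4096      # values going into the layer
outputs = 4096     # values coming out of the layer

# Multiplying a (batch_size x inputs) matrix by an (inputs x outputs) matrix
# takes one multiply and one add per term: 2 * batch_size * inputs * outputs.
flops_per_forward_pass = 2 * batch_size * inputs * outputs

# Advertised peak speed quoted in the text (112 teraflops).
# Assumption: the GPU runs at this peak the whole time.
gpu_flops_per_second = 112e12

seconds = flops_per_forward_pass / gpu_flops_per_second

print(f"Operations in one forward pass: {flops_per_forward_pass:,}")
print(f"Time at advertised peak speed: {seconds * 1e6:.1f} microseconds")
```

This counts a single layer and a single forward pass.  Training a real model repeats this kind of calculation across many layers and many batches, which is why the hardware matters so much.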

<h3>Pedagogical Note</h3>

Whenever possible, I will provide links to some really great training material produced by other folks. That material may include:


*   CoLab Notebooks
*   YouTube Videos
*   Free "Creative Commons Licensed" online books
*   Journal Articles


But whenever something is really, really important, I will include it directly in these CoLab Notebooks.  
I'll also do my best to include easy-to-modify sample code.  

**If you want to become an Expert...**

I'm trying to make this hands-on, "applied" training,  so I won't include a lot of the mathematical details and more arcane Python code.

If you really want all the gory details, I strongly recommend these books:


*   [Python Data Science Handbook](https://www.amazon.com/Python-Data-Science-Handbook-Essential/dp/1491912057) by Jake VanderPlas
   *   An excellent introduction to the tools used by data scientists.  It discusses Jupyter notebooks, Numpy, Pandas, Matplotlib and the Scikit-Learn machine learning library.
*   [Deep Learning with Python](https://www.amazon.com/Deep-Learning-Python-Francois-Chollet/dp/1617294438/) by Francois Chollet
   *   An intermediate-level book that includes the history of machine learning and enough math to let you understand how neural networks work.  The book focuses on the Keras library, which has been adopted as the main programming API in TensorFlow 2.0.    
   
*   [Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow](https://www.amazon.com/gp/product/1492032646/) by Aurélien Géron
   *   A more advanced text (which won't be published until October 2019).  If you understand everything in this book, you're pretty much an expert.



*   [Deep Learning](https://www.deeplearningbook.org/) by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
   *   If you plan on doing PhD-level research on deep learning, this is the book you need.








<h3>What's Next?</h3>

The first step is to introduce you to the Google CoLab environment.  

CoLab lets you write and run Python programs in your web browser.  It also lets you develop TensorFlow deep learning models, and allows you to run those models for free on Google Virtual Machines (VMs) that have GPU hardware designed to speed up the training of your models.

Then I'll provide a quick overview of Python, including some tools Python uses for scientific computing.
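
As a small preview, here is one way you might check whether your notebook can see a GPU.  This is only a sketch: it assumes a recent TensorFlow 2.x release (where `tf.config.list_physical_devices` is available), and `list_gpus` is a helper name I made up for this example.

```python
def list_gpus():
    """Return the names of the GPUs TensorFlow can see (empty list if none)."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment, so report no GPUs.
        return []
    # Assumes TensorFlow 2.x; each returned device has a .name attribute.
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = list_gpus()
if gpus:
    print("GPU(s) available:", gpus)
else:
    print("No GPU found.  Your code will run on the CPU instead.")
```

If no GPU shows up in CoLab, the notebook's runtime settings are the usual place to look, since the hardware accelerator is chosen there.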
