
Hello there 👋, I am Juan M. Banda



Hello, my name is Juan M. Banda, and I am currently an assistant professor of computer science at Georgia State University. In my research lab, Panacea Lab, we aim to build machine learning, computer vision, and NLP methods that generate insights from large-scale, multi-modal data sources. With applications in precision medicine, medical informatics, astroinformatics, and other domains, our work addresses domain-specific problems with data science methods and practices.

An engineer at heart and in practice for the last 20 years, I have used Python, Bash, ontologies, and NLP tools to build pipelines that annotated over 68 million clinical notes. I have built custom ETLs to map over 8 million patient electronic health records, from 4 institutions, to the OMOP common data model for large-scale analytics and machine learning. I have designed pipelines, databases, and processes to build research infrastructure for my current and previous labs. I have used R, SQL, Matlab, Perl, Java, JavaScript, and other languages to acquire, clean, and operationalize data from multiple sources. I have mined over 9 billion tweets for NLP tasks and insights. In my earlier days, I built content-based image retrieval systems for NASA’s SDO mission, with the capacity to process and index over 40,000 images daily and to provide computer vision-aided similarity search for images. I started my engineering days designing and developing point-of-sale systems written in Visual Basic.

Apart from my technical skills, I have strong communication and writing skills (over 50 refereed publications) and management skills (I have managed over 40 employees and 20 students). Driven by a desire to improve patient outcomes and medical care, and to build things that change people’s lives, I am committed to releasing all my work under open-source licenses, following the FAIR data sharing principles.

✈️ Yes, that is me in the middle of the picture at the ruins of Abu Simbel. I am an avid traveler and have visited over 100 countries 🌎.

🛠️ Tools and Technologies

Operating Systems: Windows, CentOS, Ubuntu, macOS

Programming Languages: R, Python, PHP, JavaScript, Java, Matlab, C++, HTML5, CSS3, Shell

Databases: PostgreSQL, MySQL, SQL Server, Oracle, MongoDB

Cloud Environments: Amazon AWS, Microsoft Azure, Google Cloud

Other Tools: TensorFlow, Pandas, WEKA, NLTK, spaCy, NumPy, Elasticsearch, Vim, Git, GitHub, Jupyter, Colab, MapReduce, Spark, Solr

Currently Learning:


📊 GitHub Statistics

[Images: Juan's GitHub stats; top languages]

Lab-related projects

Project 🚧 | Stars | Forks 🍴 | Issues | Pull Requests 🌿
Covid-19 Twitter dataset | GitHub stars | GitHub Forks | GitHub Issues | GitHub PRs
Social Media Mining Toolkit | GitHub stars | GitHub Forks | GitHub Issues | GitHub PRs
APHRODITE | GitHub stars | GitHub Forks | GitHub Issues | GitHub PRs


  1. Covid-19 Twitter dataset for non-commercial research use and pre-processing scripts - under active development

    Jupyter Notebook 428 175

  2. Social Media Mining Toolkit (SMMT) main repository

    Python 116 30

  3. [in development]

    R 29 10

  4. Portal to check for drug interactions using more than six public APIs and research datasets

    CSS 5

  5. Drug Safety Portal - Version 2.0 with Researcher Profiles

    PHP 1 1

  6. 🏆 LocalSecrets.Club - TechCrunch Disrupt 2016 Braintree Prize Winning Project

    PHP 1

76 contributions in the last year


Contribution activity

June 2022

Created 3 commits in 1 repository
