
banner that says Elias Limouni, portfolio of a junior data scientist

I'm Elias, a Junior Data Scientist 👨‍💻 Working since June 2020💻

  • 🔭 I’m currently working on NLP applications at OppScience
  • 📖 Master of Science in Computer Science at Université de Technologie de Compiègne (French Engineering degree), graduated in 2023
  • 🌱 I'm always learning new techs to keep myself up to date
  • 📫 Feel free to contact me on LinkedIn
  • ⚡ I love making robots, using Machine Learning to improve them

Certifications & Badges

To see the details of the certificates and the authenticity verification page, feel free to click on them. The diplomas are available on my LinkedIn profile.

IBM Data Science Professional Certificate

Machine Learning Specialization Certificate

Deep Learning Specialization Certificate

Machine Learning Engineering for Production (MLOps) Specialization Certificate

Skills

To get the name of the skill, place your cursor on it.

Data Science

Python Google Colab Jupyter-Notebook Scikit-learn Seaborn Pandas TensorFlow PyTorch OpenCV Keras R MATLAB Elasticsearch MySQL

Robotics & IoT

Linux Raspberry Pi Arduino C C++ ROS, Robot Operating System

DevOps & Misc

Git GitLab GitHub Bash GCP Docker Nginx Node.js Express.js WordPress PHP Java HTML5 CSS3 LaTeX

Projects

To improve the readability of my projects, here is a legend of the emojis used in the project titles:

  • 🔒 : private project, for a company for example, so I cannot show the code
  • 🧮 : data science project
  • 🤖 : robotics / IoT project
  • 📚 : educational project
  • 👨 : personal project

List of the projects:

🧮 📚 Winning Space race with Data Science

To earn the IBM Data Science Professional Certificate, I developed solutions for this applied Data Science capstone.

SpaceX intends to reduce the cost of space flights by reusing the first stage of its rockets. The goal of this project is to predict whether the first stage of a Falcon 9 rocket will land back on Earth successfully.

Data Collection on two sources for the Applied Data Science Capstone from IBM

The data used was collected using the SpaceX REST API, and a Wikipedia article about the Falcon 9 rockets. I performed Web Scraping in order to extract useful data from the second data source.

I performed an Exploratory Data Analysis using Folium to make a map to study distances, Dash to create an interactive dashboard, and of course other Python libraries to create scatterplots, bar charts and lots of other visualizations.

Dashboard created with Dash from Plotly for the Applied Data Science Capstone from IBM

I used 4 classification models and optimized their parameters to find the best solution to solve the problem.

At the end of the project, I wrote a concise but complete 45-page report to explain the work using a storytelling method.

🧮 📚 Regression ML algorithms for CVSS estimation

This project comes from a practical issue: in some fields it is important to evaluate how dangerous a cyber attack is, in order to prioritize fixes. The CVSS, a score that ranges from 0 to 10, measures this criticality. Using Machine Learning to estimate this score from the stack and other information might make the evaluation quicker. In this Notebook, we try this solution using a regression approach.

Learning curve of the Random Forest model

Most of the work is data exploration, mostly using matplotlib and seaborn to draw correlations and highlight useful information for future use. I used scikit-learn and pandas to manipulate the data and to create preprocessing and modeling functions, in order to find the best combination of preprocessing and model, using the mean median error as the most representative performance metric.

This project was done at university for researchers working on detecting cyber attacks on autonomous trains.

🧮 🔒 Multilingual NER models evaluation

Named Entity Recognition (NER) is a Natural Language Processing task. It is an automatic data analysis problem that consists in extracting entities of given types from a text. A NER model can, for example, extract all the people, dates, locations, etc. from a document:

Schema of entities extracted, NER

These models are usually monolingual. However, the company needed to explore the possibility of using a single model to extract entities from many documents in 5 languages. This would spare its customers from deploying 5 different models and from detecting the language of each analyzed document.

To do so, I found lots of annotated datasets (from Kaggle and other sources), several pre-trained models, and designed a benchmark to calculate the metrics of the models. To understand why some results were low, I analyzed the results by tag, like this:

Example of the scores of tags in a NER task
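A per-tag breakdown like the one described above can be sketched in a few lines of Python. This is only an illustration, not the benchmark used in the project: it assumes token-level scoring on two aligned tag sequences, and the tag names (`PER`, `LOC`, `DATE`, `O`) and the sample data are hypothetical.

```python
from collections import Counter


def per_tag_scores(gold, pred, outside="O"):
    """Return {tag: (precision, recall, f1)} for two aligned tag sequences."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            if g != outside:
                tp[g] += 1          # correct entity tag
        else:
            if p != outside:
                fp[p] += 1          # predicted a tag that is not in the gold data
            if g != outside:
                fn[g] += 1          # missed (or mislabeled) a gold tag
    scores = {}
    for tag in sorted(set(tp) | set(fp) | set(fn)):
        predicted = tp[tag] + fp[tag]
        expected = tp[tag] + fn[tag]
        precision = tp[tag] / predicted if predicted else 0.0
        recall = tp[tag] / expected if expected else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        scores[tag] = (precision, recall, f1)
    return scores


# Hypothetical gold and predicted tags for a six-token document.
gold = ["PER", "O", "LOC", "DATE", "O", "PER"]
pred = ["PER", "O", "O", "DATE", "LOC", "LOC"]
scores = per_tag_scores(gold, pred)
for tag, (precision, recall, f1) in scores.items():
    print(f"{tag}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Looking at the scores tag by tag makes it visible when one entity type (here the hypothetical `LOC`) drags the global score down while others are recognized well.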

This further study allowed me to understand the semantic reasons for these disparities in results. It allowed the team to correct the issue with transfer learning, for example by fine-tuning.

🧮 🔒 OCR benchmark and preprocess optimization

The company needed to quantify the performance of its OCR tool. Optical Character Recognition is a Computer Vision technology that extracts the text from an image.

OCR picture working

Therefore, I designed a benchmark to do so. First, I had to think about how to evaluate an OCR:

  • which preprocessing functions?
  • what kind of data?
  • how to quantify the quality of an OCR's output?

I chose three criteria and studied their impact on the metrics: the font, the font size, and the quality (dimensions) of the document. I also studied the impact of the grayscale and rotation preprocessing functions. Based on these choices, I searched the internet for several datasets to get a representative mix of documents. I then standardized them to follow the hOCR format, which cuts the picture into boxes to locate the text. This allowed me to match the extracted text to its true value.

To quantify the OCR's performance, I used an optimized Levenshtein distance function to compute the precision, recall and F1 score.
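The project's own scoring code is private, so the following is only a sketch of one possible way to turn an edit distance into precision, recall and F1. It assumes that each ground-truth box is already paired with the text extracted for it (an empty string when nothing was extracted), and that a box counts as correct when its normalized edit distance stays under a hypothetical `max_error_rate` threshold.

```python
def levenshtein(a, b):
    """Edit distance between two strings, keeping only two rows in memory."""
    if len(a) < len(b):
        a, b = b, a                          # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def box_scores(truth_boxes, ocr_boxes, max_error_rate=0.2):
    """Precision, recall and F1 over aligned boxes (threshold is an assumption)."""
    correct = 0
    for truth, extracted in zip(truth_boxes, ocr_boxes):
        if not extracted:
            continue                          # nothing was extracted for this box
        error_rate = levenshtein(truth, extracted) / max(len(truth), len(extracted))
        if error_rate <= max_error_rate:
            correct += 1
    extracted_count = sum(1 for text in ocr_boxes if text)
    precision = correct / extracted_count if extracted_count else 0.0
    recall = correct / len(truth_boxes) if truth_boxes else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical boxes: one exact match, one small OCR error, one missed box.
truth_boxes = ["Invoice", "Total: 42", "Paris"]
ocr_boxes = ["Invoice", "Tota1: 42", ""]
precision, recall, f1 = box_scores(truth_boxes, ocr_boxes)
print(precision, recall, f1)
```

Normalizing the distance by the longer string keeps the threshold comparable between short and long boxes.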

After that, I improved the rotation preprocessing function to reduce the processing time, based on the determined angle.

At the end of the project, I gave a presentation explaining all the proposals to the other members of the team, so we could decide which changes to integrate into the program.

🧮 🔒 Machine Learning for Sentence Boundary Detection capabilities

To improve an NLP pipeline, the company needed to improve the preprocessing of the data. Tokenization is one essential preprocessing step. There are several ways to tokenize a text; one of them is to cut it into paragraphs, extract the sentences from each paragraph, and then get each word and punctuation mark of these sentences. The benefit of this analysis is that it preserves the fact that two words in the same sentence are more closely linked than words in separate ones. The issue is that sentences do not always end with a period, an exclamation point or another punctuation mark. Sentence Boundary Detection (SBD) is a current NLP problem, and models exist for this task. Therefore, I had to evaluate the models currently used in the company's solution, and then try to find SBD models that do better.

What is SBD
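To make the paragraph, sentence and word hierarchy concrete, here is a deliberately naive tokenizer. It is not the company's implementation: the splitting rules (blank line between paragraphs, sentence end at `.`, `!` or `?` followed by whitespace) are assumptions chosen to show where a purely punctuation-based rule breaks down.

```python
import re


def naive_tokenize(text):
    """Split text into paragraphs, then sentences, then word/punctuation tokens."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    result = []
    for paragraph in paragraphs:
        # Naive rule: a sentence ends at '.', '!' or '?' followed by whitespace.
        sentences = re.split(r"(?<=[.!?])\s+", paragraph)
        # Words are runs of word characters; any other non-space character
        # (punctuation) becomes its own token.
        result.append([re.findall(r"\w+|[^\w\s]", sentence) for sentence in sentences])
    return result


text = "Dr. Smith arrived. He was late!\n\nNext item"
tokens = naive_tokenize(text)
for paragraph in tokens:
    print(paragraph)
```

On this sample, the rule wrongly cuts after the abbreviation "Dr.", and the second paragraph has no final punctuation at all, which is typical of headers and list items. Both cases are the kind of boundary that a dedicated SBD model is expected to handle.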

To do so, I had to find SBD datasets that contained various forms of data such as tables, headers, lists, etc., and standardize their format to evaluate models as completely as possible. Then, I surveyed the state of the art to list the available models. Next, I created a benchmark in Python to test these models, approaching the problem as a binary classification (1 if the index is a bound, else 0). I was then able to evaluate the precision, recall, and F1 score of the 'is a bound' class (1).

The main issue I faced in the SBD problem is that there are many ways to consider that a substring of a text is a sentence or not. It really depends on the annotated dataset used to train the model, as shown on the next picture. That is why I had to add some tolerance to be objective.

Different kind of bounds
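The idea of scoring the 'is a bound' class with some tolerance can be sketched as follows. This is an assumption about how such a tolerance could work, not the benchmark's actual code: boundaries are represented as character offsets, a predicted offset matches a gold offset when they differ by at most `tolerance` characters, and the matching is a simple greedy pass.

```python
def boundary_scores(gold, predicted, tolerance=1):
    """Precision, recall and F1 of predicted boundary offsets against gold ones.

    Each gold offset can be matched by at most one predicted offset.
    """
    unmatched_gold = sorted(gold)
    true_positives = 0
    for offset in sorted(predicted):
        # Greedy matching: take the first remaining gold offset within tolerance.
        match = next((g for g in unmatched_gold if abs(g - offset) <= tolerance), None)
        if match is not None:
            unmatched_gold.remove(match)
            true_positives += 1
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical offsets: the model is one character off on the first real boundary
# and predicts a spurious boundary at offset 3.
gold = [18, 31, 50]
predicted = [3, 19, 31]
strict = boundary_scores(gold, predicted, tolerance=0)
tolerant = boundary_scores(gold, predicted, tolerance=1)
print("strict:", strict)
print("tolerant:", tolerant)
```

With no tolerance, a boundary placed just before or just after a closing quote or a trailing space counts as an error, even though both annotations are defensible. A small tolerance avoids penalizing a model for following a different annotation convention.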

I tried many models, some purely syntactic (such as PySBD), others complex models using Neural Networks (for example Stanza). At the end of the study, I gave a presentation with my analysis of the limits of each model (mostly the impact of punctuation), in order to build one better model.

On the best model, I obtained very good results that hugely improved the solution:

Different kind of bounds

🧮 📚 Classification model on the Adults Income dataset

As a project for a Machine Learning course, a teammate and I had to explore solutions to predict whether people's income was less or more than 50K dollars a year (binary classification). The only two rules to respect were:

  • to find a solution quickly (we had a maximum of 10 hours per person on this project)
  • to use the famous Adults Income dataset

Metrics obtained with a model

We used a Jupyter Notebook to record all the work done during the study. To process the dataset, we used libraries such as Pandas and Seaborn (for the data exploration), and Scikit-learn to test several models. We optimized the preprocessing to get the best processing chain.

As I usually do, I have done the study in 4 parts:

  • exploratory data analysis, using graphs, statistics, plots etc.
  • data preparation, by creating several preprocessing chains
  • modeling, by training several models and evaluating every preprocessing-model combination
  • evaluation, to measure the performance of our models and make sure there was no underfitting or overfitting

Training and validation scores depending on the data learned on

We did no optimization on this model, because of the time we had, but we could have optimized some parameters of the best model we got, using GridSearchCV for example.

After the study, we presented it orally with a PowerPoint presentation to explain our choices and the results we obtained.

🤖 📚 Following green target for TurtleBot3 Burger

The goal of this project was to develop a multithreaded program to make a robot follow a green target, using only:

  • A camera
  • A LiDAR

The project had requirements to respect:

  • Stop the robot if any obstacle is closer than 15 centimeters from the robot, all around it
  • Make it follow a colored target that would move in front of the robot
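The first requirement can be sketched as a simple check on one LiDAR scan. This is an illustrative sketch rather than the project's node: it assumes the scan is available as a list of distances in meters covering the full 360 degrees, and that invalid readings show up as `0.0`, `inf` or `NaN`. The function name and constant are hypothetical.

```python
import math

STOP_DISTANCE_M = 0.15  # 15 centimeters, from the requirement above


def must_stop(ranges, stop_distance=STOP_DISTANCE_M):
    """Return True if any valid reading is closer than stop_distance."""
    # Assumption: 0.0, inf and NaN mean "no valid measurement" and are ignored.
    valid = [r for r in ranges if math.isfinite(r) and r > 0.0]
    return any(r < stop_distance for r in valid)


# Hypothetical scans, in meters.
print(must_stop([0.50, 0.14, 1.20]))                       # obstacle at 14 cm
print(must_stop([0.50, float("inf"), 0.0, float("nan")]))  # only invalid or far readings
```

Because the check looks at every reading in the scan rather than a frontal sector, it covers obstacles all around the robot, as the requirement asks.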

To respect these needs, I have implemented several features, using Python and the ROS library to parallelize the data processing in 7 nodes, as explained on the following picture.

ROS nodes and topics

To optimize the robot’s motion, I implemented a distance estimator that uses the image of the target, knowing its real size. This also let me add a function that makes the robot move backward if the target is too close.

Relation between the size of an object and its size seen by the camera
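The relation between real size and apparent size can be written with the pinhole camera model, where distance = focal length (in pixels) × real width / apparent width (in pixels). The sketch below assumes that model. The focal length, target width, target distance, dead band and speed are hypothetical values, not the ones used on the robot.

```python
def estimate_distance(real_width_m, width_px, focal_length_px):
    """Distance to the target in meters, using the pinhole camera model."""
    if width_px <= 0:
        raise ValueError("the target must be visible (width_px > 0)")
    return focal_length_px * real_width_m / width_px


def linear_command(distance_m, target_m=0.5, dead_band_m=0.05, speed=0.1):
    """Forward (> 0), backward (< 0) or zero speed, depending on the distance."""
    error = distance_m - target_m
    if abs(error) <= dead_band_m:
        return 0.0                      # close enough to the desired distance
    return speed if error > 0 else -speed


# Hypothetical setup: a 10 cm wide target and a 500 px focal length.
for width_px in (50, 100, 200):
    distance = estimate_distance(0.10, width_px, 500)
    print(width_px, "px ->", round(distance, 2), "m, command", linear_command(distance))
```

The larger the target appears in the image, the closer it is, so a target that fills too many pixels produces a negative command and the robot backs away.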

Connect with me


Popular repositories

  1. Subscriber-Count Public

    A simple JS script to show your subscriber count on a webpage.

    HTML 5 3

  2. NF92-Rapport-Latex Public

    Report to be written in LaTeX for the NF92 course (UV) at UTC

    TeX 1

  3. eliaccess Public

    My personal portfolio!

    1

  4. Following-Cart Public

    This project is a following cart that allows you to move heavy objects without effort, just by loading it and walking.

    C++

  5. NF92-Site-Auto-Ecole Public

    Driving school website to be built for the NF92 course (UV) at UTC

    PHP

  6. Site-Youascapegame Public

    Escape game by the "Youarille" clan for the integration of new students at UTC.

    HTML
