Project Task: Data Source Analyst - Improvado

This document outlines the process, approach, and execution of the assigned task for the Data Source Analyst position. The objective is to showcase skills in API research, data extraction, and technical documentation.

Repository: https://github.com/maep13/Data-Source-API-Analyst-Test


1. Project Purpose

The primary goal is to simulate a client requirement involving the extraction of specific data from GitHub's public API. The project assesses the following competencies:

  • API Research: Ability to navigate technical documentation and understand endpoint logic, authentication, pagination, and rate limits.
  • Data Extraction: Use of technical tools to make API calls and retrieve the requested data.
  • Documentation & Troubleshooting: Capability to clearly document the process and anticipate potential issues with corresponding solutions.
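As an illustration of the pagination logic mentioned above, here is a minimal sketch that follows the `Link` response header GitHub's REST API uses to point to the next page. It is not the code from the project notebook: the function names (`next_page_url`, `fetch_page`, `iter_items`) are hypothetical, and it uses only the standard library so it runs outside Colab as well.

```python
import json
import re
import urllib.request


def next_page_url(link_header):
    """Return the rel="next" URL from a Link header, or None if there is none."""
    if not link_header:
        return None
    match = re.search(r'<([^>]+)>;\s*rel="next"', link_header)
    return match.group(1) if match else None


def fetch_page(url, token=None):
    """Fetch one page of a list endpoint; return (items, Link header value)."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response), response.headers.get("Link")


def iter_items(start_url, fetch=fetch_page):
    """Yield items from every page, following rel="next" links until none remain."""
    url = start_url
    while url:
        items, link_header = fetch(url)
        yield from items
        url = next_page_url(link_header)
```

Passing `fetch` as a parameter is a design choice for this sketch: it lets the paging loop be exercised with a fake fetcher, without network access or a token.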

2. Approach and Tools

This task uses Google Colab with Python as the primary tool for data extraction, for the following reasons:

  • Power and Flexibility: Compared to GUI tools like Postman, Python provides full control over extraction logic, complex error handling, pagination, and data transformation.
  • Best Practices: It enables secure coding practices, such as managing authentication tokens through "Secrets" to prevent exposing sensitive data.
  • Task Requirement: The task description itself states that using Google Colab earns bonus points, as it demonstrates stronger technical proficiency.
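The "Secrets" approach can be sketched roughly as follows. This is an assumption about how such a lookup might be written, not the notebook's actual code: the secret name `GITHUB_TOKEN` and the helper names are hypothetical, and the fallback to an environment variable is added here so the same code also runs outside Colab.

```python
import os


def load_github_token(name="GITHUB_TOKEN"):
    """Return the token from Colab Secrets, else from the environment, else None."""
    try:
        from google.colab import userdata  # importable only inside Colab
    except ImportError:
        return os.environ.get(name)
    try:
        return userdata.get(name)
    except Exception:
        # Secret not defined, or notebook access to it not granted.
        return os.environ.get(name)


def auth_headers(token):
    """Build request headers; omit Authorization for unauthenticated calls."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers
```

Either way, the token never appears in the notebook source, so the notebook can be shared or committed without exposing it.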

3. Deliverables

The final structure of the repository includes:

  • /Content/TROUBLESHOOTING.md: A guide detailing common API errors and their resolutions.
  • /Content/DATA_CLEANING.md: A document explaining the approach for cleaning and processing extracted data.
  • /Postman_Collection/github_api_extraction.ipynb: The Google Colab notebook with the final Python script used for data extraction.
  • README.md: This central document guiding the entire project.
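To give a flavor of the kind of error handling a troubleshooting guide for this API might cover, here is a hedged sketch that maps common GitHub REST API status codes to short hints. It is not taken from TROUBLESHOOTING.md: the function names and messages are hypothetical, and `headers` is assumed to be a mapping keyed exactly as `X-RateLimit-Remaining` and `X-RateLimit-Reset`.

```python
import time


def seconds_until_reset(headers, now=None):
    """Seconds until the rate limit window resets (0 if the header is absent)."""
    reset = headers.get("X-RateLimit-Reset")  # Unix epoch seconds, as a string
    if reset is None:
        return 0
    now = time.time() if now is None else now
    return max(0, int(reset) - int(now))


def describe_error(status, headers, now=None):
    """Map an HTTP status code to a short troubleshooting hint."""
    if status == 401:
        return "Unauthorized: the token is missing, invalid, or expired."
    if status in (403, 429) and headers.get("X-RateLimit-Remaining") == "0":
        wait = seconds_until_reset(headers, now)
        return f"Rate limit exceeded: retry in about {wait} seconds."
    if status == 403:
        return "Forbidden: the token may lack the required permissions."
    if status == 404:
        return "Not found: check the URL, or the token's access to a private resource."
    if status == 422:
        return "Unprocessable: a request parameter failed validation."
    if status >= 500:
        return "Server error: retry later with a backoff."
    return f"Unexpected status {status}."
```

The rate limit check comes before the generic 403 branch because GitHub can report an exhausted limit with a 403 as well as a 429.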

4. Final Reflections

This task has been an excellent practical exercise that realistically simulates the lifecycle of a data extraction requirement. Investigating an API's documentation, structuring modular code, and documenting both the workflow and potential contingencies all reinforce the importance of a methodical approach.

Google Colab proved to be an ideal choice, not only for writing and testing Python code easily, but also for security features like "Secrets," which are essential for professional credential management.

Overall, the project has been a valuable opportunity to demonstrate the technical and analytical skills needed for the Data Source Analyst role.

About

Homework assignment for Data Source API Analyst role
