This is a challenge for potential candidates. The file uploaded to this GitHub repository is where you will start your investigation. It contains access logs for a company website. We believe one of the users accessed another user's account and proceeded to download data. You must find out who downloaded the data and what that data contained.
A successful candidate will write one or more scripts that analyze this log file to find the offending log entry. Once they have identified the offending activity, they will use that information to write one or more further scripts to access the data the malicious user downloaded.
Once complete, the candidate will commit their scripts and a synopsis of the downloaded data to a GitHub project. They should NOT upload the downloaded data itself, as it contains simulated sensitive data.
After the candidate has completed and uploaded their scripts, they will notify the hiring manager and provide the GitHub project information for the SMEs to review.
Please direct any questions through the hiring manager.
- Generative AI (e.g. ChatGPT, Bard, Copilot) is not allowed. We do not currently use or permit the use of generative AI, so we want to see what your native skills are.
- You can use search engines to help you work through the data. We are human and we can't always recall how a command is supposed to be written. You can use the internet to help you create your solution by looking up code references. This does not mean you can have someone else write your code for you.
- Show your work; we want to see how you went about getting the information. Please be sure to mask or remove any sensitive information. This includes API keys.
- The solution should be written in Python. Examples in other languages are also welcome, but the first iteration should be in Python.
- Be ready to explain what your code is doing and why you chose to write it in that manner.
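As an illustration of the masking point above, here is a minimal sketch of one way to keep an API key out of committed scripts: read it from an environment variable and mask it before printing. The variable name CHALLENGE_API_KEY and both helper functions are hypothetical, chosen only for this example; any equivalent approach is fine.

```python
import os


def load_api_key(env_var="CHALLENGE_API_KEY"):
    """Read the API key from the environment instead of hardcoding it.

    The variable name is an assumption made for this sketch.
    """
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


def mask(secret, visible=2):
    """Return the secret with all but the first `visible` characters starred out."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return secret[:visible] + "*" * (len(secret) - visible)
```

With this in place, a script can log `mask(load_api_key())` when showing its work, so the full key never appears in output or in the repository.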
To access the API you will require a header containing the following:
X-API-Key: 4c5162f0
Nothing further should be needed for the API header.
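A minimal sketch of attaching this header to a request, using only the Python standard library. The function names are hypothetical, and the URL shown in the usage note is a placeholder, not the real endpoint; the real URL has to come from your own analysis of the logs.

```python
import urllib.request


def build_request(url, api_key):
    """Build a GET request that carries the X-API-Key header and nothing else."""
    # HTTP header names are case-insensitive; urllib stores this one
    # internally as "X-api-key".
    return urllib.request.Request(url, headers={"X-API-Key": api_key})


def fetch(url, api_key, timeout=10):
    """Send the request and return the raw response body as bytes."""
    with urllib.request.urlopen(build_request(url, api_key), timeout=timeout) as resp:
        return resp.read()
```

For example, `fetch("https://example.invalid/placeholder", api_key)` would send the request with the key supplied by the caller, so the key does not need to be written into the function itself.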
All URLs except the one that leads to the downloaded data are made up for the purpose of having content.