This was a sample provided to a financial software development company

sfahadrizvi/calculate_stocks

  1. The output.csv file produced from the input.csv provided.

  2. Your thoughts on how you would set up the task to run daily. On Linux it can be set up as a crontab job; cron is provided by the system for running scheduled background tasks. On Windows you can run the application with the Windows Task Scheduler. The problem on Windows is that this is a console application, so a console window appears when it starts. Even if we free the console in the first instructions, a black console window will still be displayed briefly at startup. To prevent this, the program would need to be built as a Windows GUI application that creates no window. A neater solution would be to create a service that monitors a folder for new files and runs this application on them.

  3. Your thoughts on how you would monitor the ongoing execution of your task. To monitor execution we can add logging. A simple log file can record the data processed and any errors encountered, together with the line number of the input being processed, so that an error in the input can be located and corrected easily.

  4. Include the amount of time you spent on the solution. I first implemented a simple reading function that uses fgets to read one line at a time from stdin. That took about 40 minutes. Because reading files in larger chunks is better for I/O performance, I then added a second function that uses fread to read a large block of data from stdin or a file at once. The data is kept in a buffer and all processing is done from that buffer, which makes better use of the data cache and reduces page faults. Adding that function took about 30 minutes more.

  5. Include your thoughts on why you chose your OS/Language/tools. I used C++ with Visual Studio 2013. I chose C++ for its performance: if the goal is to read and process terabytes of data, C++ will be significantly faster. I would have used C instead of C++, but I needed vectors and there was not enough time to implement linked lists and their algorithms. The final result also had to be sorted, and the C++ STL has a good sorting function. I used Windows and Visual C++ because that is what is currently installed on my machine; I am travelling and do not have the rights to install other software on this system. I have tried to use only the standard C++ libraries, so the code should work on other platforms too.
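
The logging idea in item 3 could be sketched roughly as follows. This is only an illustration and not the code in this repository: `RunSummary`, `process_with_log`, and the rule that a valid record has a fixed number of comma-separated fields are all assumptions made for the example, since the layout of input.csv is not described here.

```cpp
#include <algorithm>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical summary of one run; the names are illustrative only.
struct RunSummary {
    std::size_t lines_ok = 0;
    std::size_t lines_bad = 0;
};

// Reads records line by line and writes one log entry per bad record,
// tagged with the 1-based input line number, plus a final summary entry.
// Assumption: a valid record has exactly `expected_fields` comma-separated fields.
RunSummary process_with_log(std::istream& in, std::ostream& log,
                            std::size_t expected_fields) {
    RunSummary summary;
    std::string line;
    std::size_t line_no = 0;
    while (std::getline(in, line)) {
        ++line_no;
        // An empty line has zero fields; otherwise fields = commas + 1.
        std::size_t fields = line.empty() ? 0 :
            static_cast<std::size_t>(std::count(line.begin(), line.end(), ',')) + 1;
        if (fields != expected_fields) {
            log << "ERROR line " << line_no << ": expected " << expected_fields
                << " fields, got " << fields << '\n';
            ++summary.lines_bad;
            continue;
        }
        // Real processing of the record would go here.
        ++summary.lines_ok;
    }
    log << "INFO processed " << summary.lines_ok << " records, "
        << summary.lines_bad << " errors\n";
    return summary;
}
```

In a scheduled run, `in` could be `std::cin` and `log` a `std::ofstream` opened in append mode, so that each daily run adds its entries to the same file.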

Additional Note Regarding Performance: I first implemented a simple fgets-based function to read a line from stdin or a file, and used STL-based split functionality to tokenize it. This already gave decent performance, partly because Windows has a good file cache: even when you read a single line, Windows reads the whole block and keeps it in memory. My system has a RAID 0 configuration of 3 SSDs and a 4 GHz Intel Core i7 processor with hyper-threading. After verifying correctness, I copied the contents of the input file over and over until the file was about 940 MB; the total processing time was about 25 seconds. After I implemented the fread version and parsed the input directly in code, I got the same output in 18.5 seconds, an improvement of over 20% (about 26%). On a regular hard drive the improvement should be larger. At that rate (roughly 50 MB/s), and assuming it scales linearly, this means about 3 to 4 minutes for a 10 GB file and 5 to 6 hours for a terabyte of data. I see two reasons for the speedup. First, fread reduces per-line overhead: fgets has to be called once per line, whereas a single fread can take the place of several thousand such calls (the C library normally buffers the underlying reads, so the saving is mainly in call and copy overhead rather than one system call per line). Second, since I parse the data on a known, predictable delimiter while it is being read, I avoid calls to a generic split function, which is not optimal for this specific format.
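
The two ideas above (reading large blocks with fread, and tokenizing on a known delimiter without a generic split) could look something like the sketch below. It is not the code in this repository; `FieldView`, `LineHandler`, `split_in_place`, and `for_each_line_chunked` are hypothetical names chosen for the example, and the comma delimiter is an assumption based on the .csv input.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// A field is a (pointer, length) view into the caller's buffer; nothing is copied.
typedef std::pair<const char*, std::size_t> FieldView;
typedef std::function<void(const char*, std::size_t)> LineHandler;

// Splits line[0, len) on `delim` and stores one view per field in `out`.
// `out` is cleared first so the caller can reuse it across lines.
void split_in_place(const char* line, std::size_t len, char delim,
                    std::vector<FieldView>& out) {
    out.clear();
    const char* cur = line;
    const char* end = line + len;
    while (true) {
        const void* hit = std::memchr(cur, delim, static_cast<std::size_t>(end - cur));
        const char* stop = hit ? static_cast<const char*>(hit) : end;
        out.push_back(FieldView(cur, static_cast<std::size_t>(stop - cur)));
        if (!hit) break;
        cur = stop + 1;  // skip the delimiter
    }
}

// Reads `fp` in blocks of `chunk_size` bytes with fread and calls `on_line`
// once per '\n'-terminated line, and once more for a final unterminated line.
// The pointer passed to `on_line` is valid only during the call.
// Returns the number of lines delivered.
std::size_t for_each_line_chunked(std::FILE* fp, std::size_t chunk_size,
                                  const LineHandler& on_line) {
    assert(fp != nullptr && chunk_size > 0);
    std::vector<char> chunk(chunk_size);
    std::string carry;  // partial line left over from the previous chunk
    std::size_t lines = 0;
    std::size_t got;
    while ((got = std::fread(chunk.data(), 1, chunk.size(), fp)) > 0) {
        std::size_t start = 0;
        for (std::size_t i = 0; i < got; ++i) {
            if (chunk[i] != '\n') continue;
            if (carry.empty()) {
                on_line(chunk.data() + start, i - start);  // line lies fully in this chunk
            } else {
                carry.append(chunk.data() + start, i - start);
                on_line(carry.data(), carry.size());
                carry.clear();
            }
            ++lines;
            start = i + 1;
        }
        carry.append(chunk.data() + start, got - start);  // keep the unfinished tail
    }
    if (!carry.empty()) {  // last line without a trailing newline
        on_line(carry.data(), carry.size());
        ++lines;
    }
    return lines;
}
```

A caller would pass a handler that runs `split_in_place` on each line and reads the fields it needs. Lines that fall entirely inside one chunk are handed over without copying; only lines that straddle a chunk boundary go through the `carry` string. The chunk size is a tuning parameter, and no particular value is implied by the measurements above.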
