README for MiniMemcachedServer:

API Details:
This memcached clone implements a small subset of the original memcached server based on the memcached text protocol. The operations implemented are:
1) SET - set a key/value pair.
2) GET - get the value held by a key.
3) DELETE - delete a key/value pair based on the given key.
4) GETS - obtain the key/value pair along with a 64-bit cas_unique id.
5) CAS - pass the key, the new value and the cas_unique id; the new value is stored only if cas_unique has not changed.
6) QUIT - exit the connection.

The details and syntax of the memcached text protocol can be found here:
https://raw.githubusercontent.com/memcached/memcached/master/doc/protocol.txt

===============================================================================
DESIGN CHOICES:

This server was designed and implemented in C++11. The language choice was driven purely by an interest in implementing many of the concurrency features myself; this could easily be implemented in other languages. It uses C++11's built-in thread support, which makes it portable to most Unix platforms. Also, since memcached itself was written in C, a low-level language should be good for performance. C++11 was chosen primarily to make use of the STL and the built-in concurrency constructs.

The networking functionality is implemented using the Unix socket programming APIs. Again, portability was the main reason for this choice. Currently, all socket calls are embedded in the main MiniMemcached class. The socket calls should probably be isolated into a separate class if Windows support needs to be added.

Configuration Limits:
The server configuration parameters are set when the server is instantiated. The default values of the CONFIG parameters can be found as macros at the start of MiniMemcachedServer.hpp. The server admin can also provide custom config parameters via the MiniMemcached constructor. An improvement here would be the ability to read config parameters from a config file rather than relying on user input.

Here are the config parameters (with their default values):
MAX_BYTES_LIMIT        100
MAX_CONNECTIONS_LIMIT  20
CONNECTION_TIMEOUT     90 (in seconds, although a microseconds option could be made available)
MAX_CACHE_SIZE         100 (total cache size)
DEFAULT_PORT           "11211"

MiniMemcached is implemented as a class that uses threads. The server's main thread is the one that accepts connections from clients. A total of MAX_CONNECTIONS_LIMIT clients can be connected to the server simultaneously. Normally, if this number were exceeded, the user would be prompted with a message asking them to try again later. In this server's design, however, we added a condition variable which causes the server to wait until the number of accepted clients drops below MAX_CONNECTIONS_LIMIT. This way, the user does not have to poll, but can wait until one of the existing connections is terminated via a QUIT command or a timeout.

The MAX_CONNECTIONS_LIMIT connections are serviced by the MiniMemcached server via a ThreadPool which creates MAX_CONNECTIONS_LIMIT threads. The advantages of using a ThreadPool are (a sketch of the general pattern follows this list):
1) The threads are created up front by the server and stay alive for future use.
2) The cost of the system call that creates each lwp/kernel thread is paid before any traffic reaches the server, so user-to-kernel mode switches are limited later on.
3) Threads created by the pool execute a member function of MiniMemcached.
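The pool in this repository is written from scratch. As a rough illustration of the pattern (and not the actual MiniMemcached code; all names here are invented for the example), the sketch below shows a fixed number of C++11 worker threads pulling per-client tasks from a queue guarded by a std::mutex and std::condition_variable:

    // Illustrative fixed-size thread pool: MAX_CONNECTIONS_LIMIT workers are
    // created once and reused; each submitted task would be the member function
    // that services one client connection.
    #include <condition_variable>
    #include <functional>
    #include <mutex>
    #include <queue>
    #include <thread>
    #include <vector>

    class FixedThreadPool {
    public:
        explicit FixedThreadPool(size_t threadCount) {
            for (size_t i = 0; i < threadCount; ++i)
                mWorkers.emplace_back([this] { workerLoop(); });
        }
        ~FixedThreadPool() {
            {
                std::lock_guard<std::mutex> lock(mMutex);
                mStop = true;
            }
            mCond.notify_all();
            for (auto& worker : mWorkers) worker.join();
        }
        // Hand one accepted client over to a pooled thread.
        void submit(std::function<void()> task) {
            {
                std::lock_guard<std::mutex> lock(mMutex);
                mTasks.push(std::move(task));
            }
            mCond.notify_one();
        }
    private:
        void workerLoop() {
            for (;;) {
                std::function<void()> task;
                {
                    std::unique_lock<std::mutex> lock(mMutex);
                    mCond.wait(lock, [this] { return mStop || !mTasks.empty(); });
                    if (mStop && mTasks.empty()) return;
                    task = std::move(mTasks.front());
                    mTasks.pop();
                }
                task();  // e.g. run the per-client handler for one socket fd
            }
        }
        std::vector<std::thread> mWorkers;
        std::queue<std::function<void()>> mTasks;
        std::mutex mMutex;
        std::condition_variable mCond;
        bool mStop = false;
    };

The same std::condition_variable idea is what lets the accept loop block, rather than poll, while all MAX_CONNECTIONS_LIMIT slots are in use.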
Disadvantages:
1) ThreadPool == shared mutable state. This means we need mutexes to control access to the critical sections of the class. These critical sections should be kept very small to avoid blocking.
2) The current ThreadPool is completely static and implemented by us. It does not currently have the ability to grow the number of threads in the pool if the need arises. This could be an add-on feature.
3) The number of threads that can be spawned on a single machine is limited. The limit can be viewed via the `ulimit -a` command.

Alternative to ThreadPool:
As discussed above, the downside of a ThreadPool is that we can only scale vertically, limited by the resources of our server. One could instead use the Actor model, which involves creating processes and passing messages; this would help us scale horizontally as well. The reasons for choosing a ThreadPool for now are the time constraint and the ability to code something up with simple C++ constructs rather than third-party libraries.

Coming back to the rest of the functionality: once the server receives the new socket fd for a client, the processing of that client is delegated to a single thread in the pool. Each thread waits for commands sent by its client and waits up to CONNECTION_TIMEOUT seconds for new data. The connection timeout is implemented using select(2). Alternatives to select(2), which is a slow syscall, are libraries like libevent or other asynchronous IO facilities provided by the kernel. select(2) was ultimately chosen purely for portability. The time to pick performance over portability is when the admin knows the OS/machine where the server will run; portability matters if the server is running on a cloud that can migrate the compute instance between OSes.

Clients can currently connect to the server with a simple telnet command. Example:
telnet <ip-address> <port-number>
Once the connection is successful, the user receives a '>' prompt for entering commands. In the current implementation, command input validation is strict. Any deviation from the syntax in the protocol document is not handled and generates an ERROR message. Extra whitespace is not handled currently; this could be an improvement for later.

The cache limit can be set via the constructor of the class. The default value is MAX_CACHE_SIZE. An improvement here would be to add a cache eviction algorithm, either Least Recently Used (LRU) or Least Frequently Used (LFU). Another improvement not handled by this version is a client-side API that would allow key/value pairs to be sent to and received from the server programmatically rather than through telnet.

The use of mutexes for both reads and writes creates a choke point for multiple readers trying to read the same data. The easy solution is shared_mutex, C++'s built-in reader-writer lock, which would let multiple readers proceed in parallel. However, C++11 does not have built-in support for shared_mutex; it is only available from C++17, which is not available in Xcode.
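As a minimal sketch of the locking pattern just described (not the actual MiniMemcached code; the class and member names are invented here), the cache below takes a single std::mutex for every operation, including the cas_unique bookkeeping used by GETS/CAS:

    // Every operation, reads included, takes the same exclusive lock, which is
    // exactly the reader choke point discussed above.
    #include <cstdint>
    #include <mutex>
    #include <string>
    #include <unordered_map>

    class MiniCache {
    public:
        bool get(const std::string& key, std::string& valueOut) {
            std::lock_guard<std::mutex> lock(mMutex);
            auto it = mData.find(key);
            if (it == mData.end()) return false;
            valueOut = it->second.value;
            return true;
        }
        void set(const std::string& key, const std::string& value) {
            std::lock_guard<std::mutex> lock(mMutex);
            Entry& entry = mData[key];
            entry.value = value;
            ++entry.casUnique;            // new 64-bit cas_unique for GETS/CAS
        }
        // CAS: store only if the caller's cas_unique still matches the stored one.
        bool cas(const std::string& key, const std::string& value, uint64_t casUnique) {
            std::lock_guard<std::mutex> lock(mMutex);
            auto it = mData.find(key);
            if (it == mData.end() || it->second.casUnique != casUnique) return false;
            it->second.value = value;
            ++it->second.casUnique;
            return true;
        }
    private:
        struct Entry { std::string value; uint64_t casUnique = 0; };
        std::unordered_map<std::string, Entry> mData;
        std::mutex mMutex;
    };

With C++17, mMutex could become a std::shared_mutex and get() could take a std::shared_lock so concurrent readers no longer block one another.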
===============================================================================
HOW TO BUILD:

The project comes with an Xcode project which should work on any Mac. If the user prefers building via the command line, just unzip the target and issue the command:
bash# g++ -std=c++11 main.cpp ./src/*cpp -I ./src -o memcached
followed by:
bash# ./memcached
Please ensure the build toolchain is installed on your server. If you are running it on a Mac, install Xcode.

===============================================================================
TESTING:

In order to run this server, we need a test program in C++11. The easiest way to go about it is to use MiniMemcached's constructor:
MiniMemcached(string portNum = DEFAULT_PORT,
              int connectionCount = MAX_CONNECTIONS_LIMIT,
              int cacheSize = MAX_CACHE_SIZE,
              int connectTimeout = CONNECTION_TIMEOUT,
              string configFile = "");
As mentioned above, the defaults can be found as macros in MiniMemcached.hpp. Once we create the server object, all we need to do is run serverObj.initServer(); and the server starts running. Ideally this would run as a daemon, but that could be phase two of the project. serverObj.shutdown is meant for graceful shutdown: it stops the ThreadPool and frees the internally allocated memory. Currently, however, the server expects to be terminated by a SIGINT (Ctrl+C).

Once the server is running, a user can connect to it via:
bash# telnet <localhost/ip address> 11211
The usage of this server can be seen in main.cpp found in /src (a minimal example along those lines is sketched below). Test automation of the current project can only be done via expect scripts.
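For reference, a hypothetical program in the spirit of main.cpp might look like the following. The header name and the exact shutdown call are assumptions based on the constructor and methods documented above and may differ from the repository's actual code:

    // Hypothetical usage sketch based on the constructor documented above.
    #include "MiniMemcachedServer.hpp"   // header name assumed; see /src

    int main() {
        // Override the defaults explicitly: port 11211, 20 concurrent clients,
        // a 100-entry cache, and a 90-second connection timeout.
        MiniMemcached server("11211", 20, 100, 90);

        // Blocks and services clients; the process is currently expected to be
        // stopped with SIGINT (Ctrl+C).
        server.initServer();

        // Graceful shutdown: stops the ThreadPool and frees internal memory.
        server.shutdown();
        return 0;
    }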
======================== Actual Requirements ===================================

This exercise is meant to demonstrate:
An understanding of servers, networking, and protocols. An understanding of concurrency, performance, and resource constraints, and an ability to anticipate future issues and implement solutions. An ability to write clear, easy to understand code, communicate your approach, and reason about tradeoffs that you have made.

The exercise:
Write a memcache server that has support for the Get, Set, and Delete methods using either the Memcache Text Protocol or the Memcache Binary Protocol. Include support for Compare And Swap, either by implementing the GETS and CAS methods in the text protocol or by using the CAS field in the binary protocol. You don't need to implement support for the expiry fields. Support a configurable limit for the resource consumption of your service. Handle concurrency issues that arise in an environment with multiple simultaneous writers and readers.

Discussion:
Tell us what you expect the performance of your implementation to be, and where you anticipate bottlenecks. Suggest ways in which the performance, reliability, or other limitations of your implementation could be improved upon in the future. Suggest or implement ways in which your service could be monitored and managed in a production setting. Show how you tested your server. Provide instructions on what we need to do to build and run your server locally on a standard unix-like development environment (e.g. Mac laptop or Linux server).

Notes:
You should choose an appropriate language and runtime given the requirements of the exercise. Feel free to run your choice by us if you have questions. You may use basic collections and support classes from the language of your choice, but you are responsible for implementing the resource constraints, eviction algorithm, and concurrency control. We don't expect you to reimplement all of memcache in a few hours. (Please don't try to!) It's more important that your solution work and be complete, and that you can reason about the performance tradeoffs and potential concurrency issues that you'd expect given real-world usage.

Please provide all source code via a Git repository – create a local repo for your code, and when the exercise is complete, please tar or zip your main directory and send it along to us (and please remember to include your .git directory/files). If you're not comfortable with Git, you may simply provide a tgz of the source code.