
Update pages

mythz committed Aug 17, 2014
1 parent bb38baf commit 59e0ad03cab015eb363fd6cb6478ece1cd43de47
@@ -0,0 +1,78 @@
# NoSQL and RDBMS – Choose your weapon.

Sensationalist headline, right? Unfortunately I think the aggressive tone of the term ‘NoSQL’ is one of the reasons a lot of people feel instant resentment towards the technology. It encourages flame-igniting posts like http://teddziuba.com/2010/03/i-cant-wait-for-nosql-to-die.html which, when posted to Slashdot, get every developer who has ever touched an RDBMS to weigh in and pass judgement on technology they’ve never used, in a combined post that also declares their eternal love for their RDBMS of choice.

The negative posts generally share the same tone:

> I have developed with RDBMS for 10 years and I’ve never needed to use a NoSQL database. RDBMS can scale just as good as NoSQL.

Unfortunately statements like the above instantly illustrate that the developer has a biased attachment to a technology they’ve used all their life, whilst at the same time declaring they have absolutely no knowledge (or desire to gain any knowledge) of the subject on which they are passing judgement. It’s most likely these developers have also made message queues fit in databases and marvelled at their configuration-mapping ability to have an eagerly-loaded chain of nested objects auto-magically bind to their pristine domain model. Yes, this is quite a feat to be proud of; unfortunately it also happens to be a one-liner in a lot of non-relational databases. This ability to serialize your domain model without mapping it to a database using an ORM is not limited to NoSQL databases: other data persistence solutions like db4o (an object-oriented database) achieve this equally well.

## Picking the best tool for the job?

All this says is that RDBMS’s are really good at doing what they do, which is storing flat, relational, tabular data. Believe it or not, they still remain the best solution for storing relational data. Using a NoSQL data store isn’t an all-or-nothing decision. It actually serves as a good complementary technology to have alongside an RDBMS. Yes, that’s right: even though they have overlapping feature-sets they can still be great together. Awesome – we can all still be friends!

It’s still all about picking the right tool and using the right technology for the task at hand. Which leads me to what NoSQL databases are naturally good at:

- Performance – As everything is retrieved by key, effectively all your queries hit an index. Redis, an in-memory data-store (with optional async persistence), can achieve 110000 SETs/second and 81000 GETs/second on an entry-level Linux box, and no, this is not possible with any RDBMS.
- Replication – A feature common in most NoSQL data stores is effortless replication. In Redis this is achieved by un-commenting one line: ‘slaveof ipaddress port’ in redis.conf
- Schema-less persistence – As there is no formal data structure to bind to and most values are stored as binary or strings the serialization format is left up to you. Effectively this can be seen as a feature as it leaves you free to serialize your object directly – which lets you do those one-liner saves that everyone is talking about. A lot of client libraries opt for a simplistic language-neutral format like JSON.
- Scalability – This seems to be a heated topic (where everyone believes they can scale their technology of choice equally well given the right setup) so I won’t delve into this too deeply, other than to say that key-value data-stores by their nature have good characteristics for scaling. When everything is accessed by key, clients can easily predict the source of data given a pool of available data-stores. Most clients also come with built-in consistent hashing, where the addition or removal of a data store does not significantly impact this predictability.
- Efficiency and Cost – As there are a plethora of options available, most NoSQL data stores are both free and open source. They also perform better and provide better utilization of server resources than comparable RDBMS solutions.
- Advanced data constructs – NoSQL variants like Redis, in addition to being a key-value data store, also provide rich data constructs and atomic operations on server-side lists, sets, sorted sets and hashes, which make things like message-queuing, notification systems and load-balancing work tasks trivial to implement.
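
To make the consistent hashing point above a little more concrete, here is a minimal sketch of a hash ring. It is an illustration under my own assumptions and not the implementation used by any particular Redis client: the `ConsistentHashRing` class, the server names and the choice of MD5 with 100 virtual nodes per server are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical illustration only; not taken from any Redis client library.
public class ConsistentHashRing
{
    // Each server is placed on the ring many times ("virtual nodes") to even out the spread.
    private const int VirtualNodesPerServer = 100;
    private readonly SortedDictionary<uint, string> ring = new SortedDictionary<uint, string>();

    public void AddServer(string server)
    {
        for (var i = 0; i < VirtualNodesPerServer; i++)
            ring[Hash(server + "#" + i)] = server;
    }

    public void RemoveServer(string server)
    {
        for (var i = 0; i < VirtualNodesPerServer; i++)
            ring.Remove(Hash(server + "#" + i));
    }

    // Walk clockwise from the key's position to the first server point on the ring.
    // A linear scan keeps the sketch short; a real client would use a binary search.
    public string GetServer(string key)
    {
        if (ring.Count == 0) throw new InvalidOperationException("No servers in the ring");
        var keyHash = Hash(key);
        foreach (var point in ring)
            if (point.Key >= keyHash) return point.Value;
        return ring.First().Value; // wrapped past the end of the ring
    }

    private static uint Hash(string value)
    {
        using (var md5 = MD5.Create())
            return BitConverter.ToUInt32(md5.ComputeHash(Encoding.UTF8.GetBytes(value)), 0);
    }
}

public static class Program
{
    public static void Main()
    {
        var ring = new ConsistentHashRing();
        foreach (var server in new[] { "redis-a:6379", "redis-b:6379", "redis-c:6379" })
            ring.AddServer(server);

        var keys = Enumerable.Range(0, 1000).Select(i => "user:" + i).ToList();
        var before = keys.ToDictionary(k => k, k => ring.GetServer(k));

        ring.AddServer("redis-d:6379");
        var moved = keys.Count(k => ring.GetServer(k) != before[k]);

        // Only keys claimed by the new server change owner; the rest stay where they were.
        Console.WriteLine("Keys remapped after adding a server: {0} of {1}", moved, keys.Count);
    }
}
```

With a naive `hash(key) % serverCount` scheme almost every key would change owner when a server is added, which is the predictability problem consistent hashing avoids.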

## Try NoSQL today

Fortunately NoSQL solutions are not black magic and are actually fairly easy to get started with. My personal favourite is Redis, for which I also happen to be the maintainer of a rich open source C# client (which can also run on Linux with Mono). If .NET is not your thing, then you’re in luck, as Redis is so popular that there is a language binding for almost every programming language in active use today, which you can find listed on its supported languages page.

Getting started is as easy as downloading the latest source from the project website. If you’re on a Windows platform you can download pre-compiled binaries using cygwin here. A simple make command from the tarball directory creates the required redis-server, which is all you need to run to get a server instance up and running.

After that you can access the comprehensive Redis feature-set exposed by the C# IRedisClient API.
To give you a taste of its simplicity, here is an example demonstrating how to persist and access a simple POCO type using the Redis client:

```csharp
public class IntAndString
{
    public int Id { get; set; }
    public string Letter { get; set; }
}

using (var redisClient = new RedisClient())
{
    //Create a typed Redis client that treats all values as IntAndString:
    var typedRedis = redisClient.GetTypedClient<IntAndString>();

    var pocoValue = new IntAndString { Id = 1, Letter = "A" };
    typedRedis.Set("pocoKey", pocoValue);
    IntAndString toPocoValue = typedRedis.Get("pocoKey");

    Assert.That(toPocoValue.Id, Is.EqualTo(pocoValue.Id));
    Assert.That(toPocoValue.Letter, Is.EqualTo(pocoValue.Letter));

    var pocoListValues = new List<IntAndString> {
        new IntAndString {Id = 2, Letter = "B"},
        new IntAndString {Id = 3, Letter = "C"},
        new IntAndString {Id = 4, Letter = "D"},
        new IntAndString {Id = 5, Letter = "E"},
    };

    IRedisList<IntAndString> pocoList = typedRedis.Lists["pocoListKey"];

    //Adding all IntAndString objects into the redis list 'pocoListKey'
    pocoListValues.ForEach(x => pocoList.Add(x));

    List<IntAndString> toPocoListValues = pocoList.ToList();

    for (var i = 0; i < pocoListValues.Count; i++)
    {
        pocoValue = pocoListValues[i];
        toPocoValue = toPocoListValues[i];
        Assert.That(toPocoValue.Id, Is.EqualTo(pocoValue.Id));
        Assert.That(toPocoValue.Letter, Is.EqualTo(pocoValue.Letter));
    }
}
```

Other noteworthy features of Redis include its support for custom atomic transactions, examples of which are here.

More examples are available at ServiceStack’s open source C# client’s home page.
@@ -0,0 +1,78 @@
# History of REST, SOAP, POX and JSON Web Services

The W3C defines a “web service” as “a software system designed to support interoperable machine-to-machine interaction over a network”.
The key parts of this definition are that it should be interoperable and that it facilitates communication over a network. Unfortunately over the years different companies have had different ideas on what the ideal interoperable protocol should be, leaving a debt-load of legacy binary and proprietary protocols in their wake.

## HTTP, the de facto web services transport protocol

HTTP, the Internet’s protocol, is the undisputed champ and will be for the foreseeable future. It’s universally accepted, can be proxied and is pretty much the only protocol allowed through most firewalls, which is the reason why Service Stack (and most other Web Service frameworks) support it. Note: the future roadmap will also support the more optimized HTML5 ‘Web Sockets’ standard.

## XML, the winning serialization format?

Out of the ashes, another winning format looking to follow in HTTP’s footsteps is the XML text serialization format. Some of the many reasons why it has reigned supreme include:

- Simple, Open, self-describing text-based format
- Human and Computer readable and writeable
- Verifiable
- Provides a rich set of common data types
- Can define higher-level custom types

XML doesn’t come without its disadvantages, which are currently centred around it being verbose and slow to parse, resulting in wasted CPU cycles.

## REST vs SOAP

Despite the win, all is not well in the XML camp. It seems that two teams are at odds over the way XML should be used in web services. On one side is what I’ll label the REST camp (despite REST being more than just XML), whose approach to developing web services is centred around resources and errs on the side of simplicity and convention, choosing to re-use the existing HTTP metaphors where they’re semantically correct. E.g. calling GET on the URL http://host/customers will most likely return a list of customers, whilst POST’ing a ‘Customer’ against the same URL will, if supported, append the ‘Customer’ to the existing list of customers.

The URLs used in REST-ful web services also form a core part of the API. They are normally logically formed and clearly describe the type of data that is expected, e.g. viewing a particular customer’s order would look something like:

GET http://location/customers/mythz/orders/1001 – would return details about order ‘1001’ which was placed by the customer ‘mythz’.

The benefit of using a logical URL scheme is that other parts of your web services API can be inferred, e.g.

- GET http://location/customers/mythz/orders – would return all of ‘mythz’ orders
- GET http://location/customers/mythz – would return details about the customer ‘mythz’
- GET http://location/customers – would return a list of all customers
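
The inference at work in these URLs can be sketched in a few lines of code. This is only an illustration of the convention, assuming paths alternate between a collection name and an item id; the `RestUrlConvention` class and its methods are hypothetical and not part of any framework.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper showing how a logical URL scheme can be read mechanically.
public static class RestUrlConvention
{
    // Segments are read as collection / id pairs:
    //   /customers/mythz/orders/1001 -> item 'mythz' of 'customers', then item '1001' of 'orders'
    public static string Describe(string path)
    {
        var segments = path.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries);
        var parts = new List<string>();
        for (var i = 0; i < segments.Length; i += 2)
        {
            var collection = segments[i];
            var hasId = i + 1 < segments.Length;
            parts.Add(hasId
                ? "item '" + segments[i + 1] + "' of '" + collection + "'"
                : "all '" + collection + "'");
        }
        return string.Join(" -> ", parts);
    }

    // Under the convention every parent path of a resource is itself a resource.
    public static IEnumerable<string> ParentUrls(string path)
    {
        var trimmed = path.TrimEnd('/');
        int slash;
        while ((slash = trimmed.LastIndexOf('/')) > 0)
        {
            trimmed = trimmed.Substring(0, slash);
            yield return trimmed;
        }
    }

    public static void Main()
    {
        const string path = "/customers/mythz/orders/1001";
        Console.WriteLine(Describe(path));
        foreach (var parent in ParentUrls(path))
            Console.WriteLine(parent + " : " + Describe(parent));
    }
}
```

Nothing in HTTP guarantees that a service actually honours these parent URLs, which is why the inference works for humans reading the API but is not something tooling can rely on.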

If supported, you may have access to different operations on the same resources via the other HTTP methods: POST, PUT and DELETE. One of the limitations of having a REST-ful web services API is that although the API may be conventional and inferable by humans, it isn’t friendly to computers and likely requires another unstructured document accompanying the web services API identifying the list, schema and capabilities of each service. This makes it a hard API to provide rich tooling support for or to be able to generate a programmatic API against.

> NOTE: If you’re interested in learning more about REST one of the articles I highly recommend is http://tomayko.com/writings/rest-to-my-wife

## Enter SOAP

The SOAP school discards this HTTP/URL nonsense and teaches that there is only one true METHOD – the HTTP ‘POST’ – and only one URL / end point you need to worry about, which depending on the technology chosen would look something like http://location/CustomerService.svc. Importantly nothing is left to the human imagination: everything is structured and explicitly defined by the web service’s WSDL, which can also be obtained via a URL, e.g. http://location/CustomerService.svc?wsdl. Now the WSDL is an intimately detailed beast listing everything you would ever want to know about the definition of your web services. Unfortunately it’s detailed to the point of being unnecessarily complex, with layers of artificial constructs named messages, bindings, ports, parts, input and output operations, etc., most of which remain un-utilized. A lot of REST folk would say this is too much info for something that can be achieved with a simple GET request :)

What it does give you however, is a structured list of all the operations available, including the schema of all the custom types each operation accepts. From this document tools can generate a client proxy in your preferred programming language, providing a nice strongly-typed API to code against. SOAP is generally favoured by a lot of enterprises for internal web services as in a lot of cases if the code compiles then there’s a good chance it will just work.

Ultimately on the wire, SOAP services are simply HTTP POSTs to the same endpoint where each payload (usually of the same name as the SOAP-Action) is wrapped inside the body of a ‘SOAP’ envelope. This layer stops a lot of people from accessing the XML payload directly, so they have to resort to using a SOAP client library just to access the core data.
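
As a rough sketch of what that wrapping looks like, the snippet below puts a payload inside a SOAP 1.1 envelope and digs it back out using LINQ to XML. The `GetCustomer` payload and the `SoapEnvelopeSketch` class are made up for illustration; a real service would also carry a SOAPAction HTTP header and possibly SOAP headers, which are left out here.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical illustration of the envelope layer; not a SOAP client.
public static class SoapEnvelopeSketch
{
    // SOAP 1.1 envelope namespace.
    private static readonly XNamespace Soap = "http://schemas.xmlsoap.org/soap/envelope/";

    // Wrap the payload as <soap:Envelope><soap:Body>payload</soap:Body></soap:Envelope>.
    public static XElement Wrap(XElement payload)
    {
        return new XElement(Soap + "Envelope",
            new XAttribute(XNamespace.Xmlns + "soap", Soap.NamespaceName),
            new XElement(Soap + "Body", payload));
    }

    // The payload is the first child element of the Body.
    public static XElement Unwrap(XElement envelope)
    {
        var body = envelope.Element(Soap + "Body");
        if (body == null) throw new FormatException("Missing soap:Body");
        foreach (var child in body.Elements()) return child;
        throw new FormatException("Empty soap:Body");
    }

    public static void Main()
    {
        // Made-up operation name and field, purely for the example.
        var payload = new XElement("GetCustomer", new XElement("Id", "mythz"));
        var envelope = Wrap(payload);

        Console.WriteLine(envelope);                          // what goes in the HTTP POST body
        Console.WriteLine(Unwrap(envelope).Name.LocalName);   // the core data a client is after
    }
}
```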

This complexity is not stopping the Microsofts and IBMs behind the SOAP specification any time soon. Nope, they’re hard at work finishing their latest creations that add additional layers on top of SOAP (i.e. WS-Security, WS-Reliability, WS-Transaction, WS-Addressing), commonly referred to as the WS-* standards. Interestingly the WS-* stack happens to be complex enough that they happen to be the only companies able to supply the compliant software and tooling to support it, which funnily enough works seamlessly with their expensive servers.

It does seem that Microsoft, being the fashionable technology company they are, don’t have all their eggs in the WS-* bucket. Realizing the criticisms of their current technology stack, they have explored a range of other web service technologies, namely WCF Data Services, WCF RIA Services and now their current favourite, OData. I expect to see all their previous efforts in WS-* transferred into promoting this new moniker. On the surface OData seems to be a very good ‘enabling technology’ that is doing a good job incorporating every good technology BUZZ-word it can (i.e. REST, ATOM, JSON). It is also being promoted as a ‘clickbox driven development’ technology (which I’ll be eagerly awaiting to see the sticker for :)

Catering for drag n’ drop developers and being able to create web services with a checkbox is a double-edged sword which I believe encourages web service development anti-patterns that run contrary to SOA-style (which I will cover in a separate post). Just so everyone knows, the latest push behind OData technology is to give you more reasons to use Azure (Microsoft’s cloud computing effort).

## POX to the rescue?

For the pragmatic programmer it’s becoming a hard task to follow the WS-* stack and still be able to get any work done. In what appears to be a growing trend, a lot of developers have taken the best bits from SOAP and WSDL and combined them in what is commonly referred to as POX or REST+POX. Basically this is Plain Old XML over HTTP with REST-like URLs. In this case a lot of the cruft inside a WSDL can be reduced to a simple XSD and a URL. The interesting part about POX is that although there seems to be no formal spec published, a lot of accomplished web service developers have ultimately ended up at the same solution. The advantages this has over SOAP are numerous, many of which are the same reasons that have made HTTP+XML ubiquitous. It is a lot simpler, smaller and faster in both development and runtime performance, while at the same time retaining a strongly-typed API (which is one of the major benefits of SOAP). Even though it’s lacking a formal API, it can be argued that POX is still more interoperable than SOAP, as clients no longer require a SOAP client to consume the web service and can access it simply with the standard Web Client and XML parser present in most programming environments, even most browsers.

## And then there was JSON

One of the major complaints about XML is that it’s too verbose, which given a large enough dataset consumes a lot of bandwidth. It is also a lot stricter than a lot of people would like, and given the potential for an XML document to be composed from many different namespaces and for a type to have both elements and attributes, it is not an ideal fit for most programming models. As a result, parsing XML can be quite cumbersome, especially inside a browser. A popular format which seeks to overcome both of these problems, and is now the preferred serialization format for AJAX applications, is JSON. It is very simple to parse and maps perfectly to a JavaScript object. It is also a safe format, which is the reason why it’s chosen over pure JavaScript. It’s also a more ‘dynamic’ and resilient format than XML, meaning that adding new elements, or renaming existing elements or their types, will not break the de-serialization routine as there is no formal spec to adhere to, which is both an advantage and a disadvantage. Unfortunately, even though it’s a smaller, simpler format, it is deceptively slower to de/serialize than XML using the available .NET libraries, based on the available benchmarks. This performance gap is more likely due to the amount of effort Microsoft has put into their XML DataContractSerializer than to a deficiency of the format itself, as the JSON-like serialization format I have been developing is both smaller than JSON and faster than XML – the best of both worlds.
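
To put a rough shape on the verbosity complaint, here is a small sketch that writes the same record as XML and as JSON and compares the byte counts. The customer record is made up, and the JSON is written by hand to keep the example free of serializer dependencies, so treat it as an illustration of size only and not as a benchmark of any library.

```csharp
using System;
using System.Text;
using System.Xml.Linq;

// Hypothetical record used only to compare the size of the two text formats.
public static class PayloadSizeSketch
{
    public static string BuildXml()
    {
        return new XElement("Customer",
            new XElement("Id", 1001),
            new XElement("Name", "mythz"),
            new XElement("Orders",
                new XElement("Order", new XElement("Id", 1)),
                new XElement("Order", new XElement("Id", 2))))
            .ToString(SaveOptions.DisableFormatting); // no indentation, to be fair to XML
    }

    public static string BuildJson()
    {
        // Hand-written equivalent of the XML above.
        return "{\"Id\":1001,\"Name\":\"mythz\",\"Orders\":[{\"Id\":1},{\"Id\":2}]}";
    }

    public static void Main()
    {
        var xml = BuildXml();
        var json = BuildJson();
        Console.WriteLine("XML : {0} bytes", Encoding.UTF8.GetByteCount(xml));
        Console.WriteLine("JSON: {0} bytes", Encoding.UTF8.GetByteCount(json));
    }
}
```

Most of the difference comes from XML repeating every element name in a closing tag, which grows with the number of items in a collection.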

## Service Stack’s new JSV Format

The latest endpoint to be added to Service Stack is JSV, the serialization format of Service Stack’s POCO TypeSerializer. It’s a JSON-inspired format that uses CSV-style escaping for the least overhead and optimal performance.
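
For readers unfamiliar with CSV-style escaping, the idea is to quote a value only when it contains a character that would otherwise be ambiguous, and to escape embedded quotes by doubling them. The sketch below applies that rule to a JSON-like `{key:value}` layout. It is my own simplified illustration and not the TypeSerializer source: the `CsvStyleEscapingSketch` class and the exact set of special characters are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration of CSV-style escaping; not ServiceStack's TypeSerializer.
public static class CsvStyleEscapingSketch
{
    // Assumed set of characters that would be ambiguous inside a {key:value,...} layout.
    private static readonly char[] Special = { '{', '}', '[', ']', ',', ':', '"' };

    // CSV-style rule: quote only when needed, and double any embedded quotes.
    public static string Escape(string value)
    {
        if (value.IndexOfAny(Special) < 0) return value;
        return "\"" + value.Replace("\"", "\"\"") + "\"";
    }

    public static string Serialize(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return "{" + string.Join(",", fields.Select(f => Escape(f.Key) + ":" + Escape(f.Value))) + "}";
    }

    public static void Main()
    {
        var fields = new[]
        {
            new KeyValuePair<string, string>("Id", "1"),
            new KeyValuePair<string, string>("Name", "Hello, \"World\""),
        };
        Console.WriteLine(Serialize(fields)); // {Id:1,Name:"Hello, ""World"""}
    }
}
```

Compared with JSON, the common case of plain keys and values needs no quotes at all, which is where the size saving comes from.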

In the interest of creating high-performance web services, and not satisfied with the performance or size of existing XML and JSON serialization formats, TypeSerializer was created with the core goal of being the most compact and fastest text-serializer for .NET. In this mission it has succeeded, as it is now both 5.3x quicker than the leading .NET JSON serializer and 2.6x smaller than the equivalent XML format.

TypeSerializer was developed from experience, taking the best features of the serialization formats it looks to replace. It has many other features that set it apart from existing formats, which make it the best choice for serializing any .NET POCO object.

- Fastest and most compact text-serializer for .NET
- Human readable and writeable, self-describing text format
- Non-invasive and configuration-free
- Resilient to schema changes (focused on deserializing as much as possible without error)
- Serializes / De-serializes any .NET data type (by convention)
- Supports custom, compact serialization of structs by overriding ToString() and static T Parse(string) methods
- Can serialize inherited, interface or ‘late-bound objects’ data types
- Respects opt-in DataMember custom serialization for DataContract DTO types.

For these reasons it is the preferred choice for transparently storing complex POCO types in OrmLite (in text blobs) and POCO objects with ServiceStack’s C# Redis Client, and the optimal serialization format for .NET to .NET web services.
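
The feature list above mentions custom, compact serialization of structs via `ToString()` and a static `Parse(string)` method. A minimal sketch of a struct following that convention might look like the one below. The `GridPoint` type and its `<x>x<y>` text layout are hypothetical, and the round trip is done by hand here instead of through the serializer.

```csharp
using System;
using System.Globalization;

// Hypothetical value type; the name and the "<x>x<y>" layout are assumptions for illustration.
public struct GridPoint
{
    public readonly int X;
    public readonly int Y;

    public GridPoint(int x, int y) { X = x; Y = y; }

    // Compact text form, e.g. "10x20". The 'x' separator avoids characters such as ',' that need quoting.
    public override string ToString()
    {
        return X.ToString(CultureInfo.InvariantCulture) + "x" + Y.ToString(CultureInfo.InvariantCulture);
    }

    // Inverse of ToString().
    public static GridPoint Parse(string text)
    {
        var parts = text.Split('x');
        if (parts.Length != 2) throw new FormatException("Expected '<x>x<y>'");
        return new GridPoint(
            int.Parse(parts[0], CultureInfo.InvariantCulture),
            int.Parse(parts[1], CultureInfo.InvariantCulture));
    }
}

public static class GridPointDemo
{
    public static void Main()
    {
        var original = new GridPoint(10, 20);
        var text = original.ToString();
        var roundTripped = GridPoint.Parse(text);
        Console.WriteLine("{0} -> X={1}, Y={2}", text, roundTripped.X, roundTripped.Y);
    }
}
```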