more blog post fix up since backup from ghost
chekkan committed Apr 12, 2022
1 parent 1ea59de commit c195c7d
Showing 9 changed files with 480 additions and 231 deletions.
20 changes: 18 additions & 2 deletions _drafts/migrate-blog-to-jekyll-and-hosted-on-github-pages.markdown
@@ -54,18 +54,33 @@
jekyll's default post format `yyyy-mm-dd-title` urls. Therefore, I had to
manually set the `permalink` front matter attribute. This wasn't so bad as I
only had around 34 blog posts.

Also, posts with links to other posts ended up with a `__GHOST_URL__` prefix,
e.g. `[Part 2 - Setting up Kibana Service](__GHOST_URL__/setting-up-elasticsearch-cluster-on-kubernetes-part-2-kibana/)`.
Make sure to find and replace these manually with `post_url` liquid tags,
e.g. `[Part 2 - Setting up Kibana Service]({{ '{%' }} post_url 2018-02-13-setting-up-elasticsearch-cluster-on-kubernetes-part-2-kibana %})`.
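
For a larger migration, a rough Ruby sketch along these lines could do the bulk
of the replacement automatically; it assumes each Ghost slug matches a Jekyll
post file name and is illustrative only, not the exact process I followed:

```ruby
#!/usr/bin/env ruby
# Rough sketch: rewrite __GHOST_URL__ links into post_url liquid tags,
# assuming each Ghost slug matches a post file named yyyy-mm-dd-<slug>.
require "pathname"

posts = Pathname.glob("_posts/*.{md,markdown}")

# Map "some-slug" => "yyyy-mm-dd-some-slug" using the post file names.
slug_map = posts.each_with_object({}) do |path, map|
  basename = path.basename(".*").to_s
  map[basename.sub(/\A\d{4}-\d{2}-\d{2}-/, "")] = basename
end

posts.each do |path|
  text = path.read
  updated = text.gsub(%r{__GHOST_URL__/([\w-]+)/?}) do
    name = slug_map[Regexp.last_match(1)]
    # Leave the link untouched if no matching post file exists.
    name ? "{% post_url #{name} %}" : Regexp.last_match(0)
  end
  path.write(updated) unless updated == text
end
```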

### Code blocks
I also had to add the correct `highlight` language tag to my code blocks in
existing posts.

```
{% raw %}
{% highlight csharp %}
...
public int Add(int a, int b) => a + b;
{% endhighlight %}
{% endraw %}
```

I later found out that jekyll can also make use of the
[GitHub Fenced Code Blocks][gh_code] syntax.

<pre class='code'>
<code>``` javascript
const add = (a, b) => a + b;
```</code>
</pre>

### Images
[jekyll_ghost_importer][1] also imports images as html tags with a fixed width
and height, which looks stretched out in Jekyll with the minima theme. I had to
@@ -81,4 +96,5 @@
have to stop migrating the blog and stick to a platform for a substantial
amount of time. And I am hopeful that this might be it.


[1]: <https://github.com/eloyesp/jekyll_ghost_importer>
[gh_code]: <https://help.github.com/articles/creating-and-highlighting-code-blocks/>
@@ -1,6 +1,7 @@
---
layout: post
title: Release Management Service and VSTS Team Build
permalink: release-management-service-and-team-build
date: '2017-07-25 22:38:00'
tags:
- vsts
@@ -9,13 +10,20 @@
- azure-devops
---

I’ve started working on a new project at my company and it is a massive project
compared to anything I have done previously. We are not short on technologies
in this project, starting with an Angular web site all the way to using
Elasticsearch. One of the challenges we faced daily was releasing all the
different systems that are part of the project into dev, test, staging and
production.

This is a blog post explaining the different challenges I faced, going through
each challenge step by step.

Let me be clear about the different technologies and environments used in this
walkthrough:

1. Visual Studio Team Services - At the time of this writing, the latest
   release for Visual Studio Online ([13th April 2016][apr-13-vso]) had come
   out.
2. GIT - for the source control
3. Angular with Typescript on the front end.
4. ASP.NET WebAPI 2 on the server
@@ -24,13 +32,28 @@
7. Elastic Search
8. SQL Server

The challenge was how to automate the full deployment of all these various
technologies into different environments in a way that was repeatable and
delivered our code predictably.

My experience before starting this with regards to continuous delivery and
operations was next to none.

At first, I had spent some time looking into PowerShell DSC. It seemed
promising, as VSTS at the time had the Release Management Application, which
only supported PowerShell DSC. It wasn’t too long before I ran into some
difficulties with PowerShell DSC. To name a few: when I first started, there
weren’t a lot of built-in resources that I could use. There were a few I could
get from the interwebs, but it was difficult to get them downloaded onto the
machine itself, such as the IIS site resources, the xWebSite resource, the
xWebAdministrator resource, etc.

The error messages provided by PowerShell DSC were really difficult to
diagnose, and finding answers through Google searches was not straightforward
either. I finally decided that PowerShell DSC wasn’t the way forward and that
we needed to rely on something else to get the job done.

I had spent a few weeks going down this route before I learned that Microsoft
was bringing release management into their cloud web portal.

[apr-13-vso]: <https://www.visualstudio.com/en-us/news/2016-apr-13-vso>
@@ -1,6 +1,7 @@
---
layout: post
title: Ingesting data from Oracle DB into Elasticsearch with Logstash
permalink: ingesting-data-into-elasticsearch-with-logstash
date: '2017-07-30 00:19:00'
tags:
- elasticsearch
@@ -10,81 +11,127 @@
- ansible
---

An alternative to Logstash was the [Elasticsearch JDBC tool][es_jdbc], which
at the time of writing used port `9300` for transferring data. There was talk
of not exposing this port externally in future releases of Elasticsearch, and
hence we went with Logstash.

## Setup

- The way we have set up the Logstash and Elasticsearch cluster at present is
  by using [Ansible][ansible].
- We have one VM with Logstash installed which can connect to the Elasticsearch
  cluster.
- The [ReadonlyRest][ror] plugin is used for managing access to our cluster.
- The [JDBC plugin][jdbc] is used to query for the data, with the
  [elasticsearch output plugin][esop] indexing it.
- A cron job schedules Logstash to run once every hour.

As of Logstash version 5.0, there is an option to enable
[http compression][htcomp] for **requests**, so make sure to take advantage of
this; we saw a reduction of up to 10-fold in data size.

## Updates

There were two options for getting the updates from the Oracle DB whilst using
the JDBC input plugin. **Option 1:** Modify the job which inserts or updates
each table that we are ingesting to also set a `lastupdated` field. The script
that runs on our schedule of once every hour would then query the Elasticsearch
index for the `max_date` on the index and pass it to the SQL that's run by the
Logstash JDBC plugin. **Option 2:** Use the `sql_last_value` plugin parameter,
which persists the `sql_last_value` value in a metadata file stored at the
configured `last_run_metadata_path`. Upon query execution, this file is updated
with the current value of `sql_last_value`. In our case, this meant that we
needed an insert or update timestamp in our table.
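
For illustration, a minimal sketch of Option 2 using the JDBC input plugin
could look like the following; the connection details, paths and the
`lastupdated` column are placeholders rather than our actual configuration.

{% highlight ruby %}

input {
  jdbc {
    # placeholder connection details
    jdbc_connection_string => "jdbc:oracle:thin:@//dbhost:1521/ORCL"
    jdbc_user => "${DB_USER}"
    jdbc_password => "${DB_PASSWORD}"
    jdbc_driver_library => "/opt/logstash/ojdbc7.jar"
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
    # only pick up rows changed since the last successful run
    statement => "SELECT * FROM my_view WHERE lastupdated > :sql_last_value"
    use_column_value => true
    tracking_column => "lastupdated"
    tracking_column_type => "timestamp"
    last_run_metadata_path => "/var/lib/logstash/.jdbc_last_run"
  }
}

{% endhighlight %}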

The primary key in the Oracle DB table is used as the document id in
Elasticsearch. This means that each updated row will correctly override the
existing document in Elasticsearch.

{% highlight ruby %}

output {
elasticsearch {
hosts => ${HOST_STRING}
index => "${ES_INDEX}"
document_id => "%{${ES_DOC_ID}}"
document_type => "${INDEX_TYPE}"
flush_size => 1000
http_compression => true
}
}

{% endhighlight %}

## Transform data

Make use of filters in order to do basic data transformations.

### Transform table column value to object

{% highlight ruby %}

mutate {
rename => { "address.line1" => "[address][line1]" }
rename => { "address.line2" => "[address][line2]" }
}

{% endhighlight %}

### Convert comma-delimited field to array of strings

{% highlight ruby %}

ruby {
  init => "require 'csv'"
  code => "['urls'].each { |type|
    if event.include?(type) then
      if event.get(type) == nil || event.get(type) == 'null' then
        event.remove(type)
      else
        # bin data if not valid CSV
        begin
          event.set(type, CSV.parse(event.get(type))[0])
        rescue
          event.remove(type)
        end
      end
    end
  }"
}

{% endhighlight %}

## Improvements

The setup described in this article doesn’t work well if we also need to
remove deleted entries. Consider using a column in the view to indicate whether
a row was removed or not, but that only works for “soft-deletes” in the
database.
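
A possible sketch (purely illustrative, assuming the view exposes an
`is_deleted` flag column) would be to route flagged rows to a delete action in
the elasticsearch output.

{% highlight ruby %}

filter {
  # assumed is_deleted column from the view; 1 means the row was soft-deleted
  if [is_deleted] == 1 {
    mutate { add_field => { "[@metadata][action]" => "delete" } }
  } else {
    mutate { add_field => { "[@metadata][action]" => "index" } }
  }
}

output {
  elasticsearch {
    hosts => ${HOST_STRING}
    index => "${ES_INDEX}"
    document_id => "%{${ES_DOC_ID}}"
    action => "%{[@metadata][action]}"
  }
}

{% endhighlight %}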

Another option is to move towards using a message bus or queuing system for
ingestion. One project by LinkedIn that caught my attention, which supports
Oracle DB as a source for ingestion, was [databus][lkndb]. But I haven’t
managed to get it set up locally (poor documentation at the time of writing).

A full re-index is currently a manual process, even though we have a script to
perform it.

## Further Reading

- 📖 [bottled water: real-time integration of postgresql and kafka](https://www.confluent.io/blog/bottled-water-real-time-integration-of-postgresql-and-kafka/)
- 📖 [data pipeline evolution at linkedin on a few pictures](http://getindata.com/data-pipeline-evolution-at-linkedin-on-a-few-pictures)
- 🎥 [change data capture: the magic wand we forgot](https://www.youtube.com/watch?v=ZAZJqEKUl3U)

_Image credit:_

- [https://flic.kr/p/8wuFEJ](https://flic.kr/p/8wuFEJ)
- [https://creativecommons.org/licenses/by-nc/2.0/](https://creativecommons.org/licenses/by-nc/2.0/)

[es_jdbc]: <https://github.com/jprante/elasticsearch-jdbc>
[ansible]: <https://www.ansible.com/>
[ror]: <https://readonlyrest.com/>
[jdbc]: <https://www.elastic.co/guide/en/logstash/current/plugins-inputs-jdbc.html>
[esop]: <https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html>
[htcomp]: <https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html#_http_compression>
[lkndb]: <https://github.com/linkedin/databus>
@@ -1,7 +1,8 @@
---
layout: post
title: Access response headers in HTTP Fetch API with Serverless Framework and
AWS Lambda
permalink: http-fetch-response-headers-with-serverless-and-aws-lambda
date: '2017-11-30 00:23:00'
tags:
- aws
@@ -11,47 +12,62 @@
- node-js
---

In order to access response headers such as `Location` in the HTTP Fetch API
whilst using the Serverless Framework and AWS Lambda functions with CORS
enabled, you need to do the following.

Make sure `cors` is set to `true` in `serverless.yml`:

{% highlight yaml %}

postUsers:
handler: handler.postUsers
events:
- http:
path: users
method: post
cors: true

{% endhighlight %}

Make sure you are returning the following response headers:

{% highlight javascript %}

callback(null, {
statusCode: 201,
headers: {
"Access-Control-Allow-Origin": "*",
// Required for cookies, authorization headers with HTTPS
"access-control-allow-credentials": true,
"access-control-allow-headers": "Location",
"access-control-expose-headers": "Location",
Location: id
}
});

{% endhighlight %}

Now, you can access the `location` header from `fetch`.

{% highlight javascript %}

this.httpClient
.fetch("/users", {
method: "post",
body: json({ username: "chekkan" })
})
.then(res => {
return res.headers.get("location");
});

{% endhighlight %}

**References:**

- [Serverless.yml CORS](https://serverless.com/framework/docs/providers/aws/events/apigateway#enabling-cors)
- [Access-Control-Allow-Headers - Mozilla Developer](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Access-Control-Allow-Headers)

_Photo by [Paul Buffington](https://unsplash.com/photos/Lwe2hbm5XKk?utmsource=unsplash&utmmedium=referral&utmcontent=creditCopyText)
on [Unsplash](https://unsplash.com/?utmsource=unsplash&utmmedium=referral&utmcontent=creditCopyText)_
