# Data Masking Sample 
### Using StackOverflow2010 Database from BrentOzar.com

My task here is to identify the sensitive data in this database so that I can record it in SQL Data Catalog (my system of record for data classification), then use that metadata to define my masking operations before the database is put to secondary use.

In [1]:
-- Get a list of tables and views in the current database
SELECT table_catalog [database], table_schema [schema], table_name [name], table_type [type]
FROM [StackOverflow2010].INFORMATION_SCHEMA.TABLES;

database,schema,name,type
StackOverflow2010,dbo,Comments,BASE TABLE
StackOverflow2010,dbo,LinkTypes,BASE TABLE
StackOverflow2010,dbo,PostLinks,BASE TABLE
StackOverflow2010,dbo,Posts,BASE TABLE
StackOverflow2010,dbo,PostTypes,BASE TABLE
StackOverflow2010,dbo,Users,BASE TABLE
StackOverflow2010,dbo,Votes,BASE TABLE
StackOverflow2010,dbo,VoteTypes,BASE TABLE
StackOverflow2010,dbo,Badges,BASE TABLE
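
Before opening each table, a column-level view of the same metadata can help shortlist where names, URLs and free text might live. This is only a sketch: it assumes the same `[StackOverflow2010]` database as above, and restricting the list to character data types is my own rough filter rather than a rule. Numeric and date columns can be sensitive too.

```sql
-- List every character-typed column; these are the likeliest homes for
-- names, URLs and free text, so they are the first candidates to review.
-- The data type filter is an assumption, not an exhaustive rule.
SELECT table_schema [schema], table_name [table], column_name [column],
       data_type [type], character_maximum_length [max_length]
FROM [StackOverflow2010].INFORMATION_SCHEMA.COLUMNS
WHERE data_type IN ('nvarchar', 'varchar', 'nchar', 'char', 'ntext', 'text')
ORDER BY table_name, ordinal_position;
```

A `max_length` of -1 indicates a `(max)` column, which usually means free text that needs reading rather than guessing.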


First, let's look at the Users table before masking. Some recognisable names here. Location is pretty specific in some cases, and AboutMe is sometimes very identifiable. WebsiteUrl definitely so. EmailHash is null in all cases. What if it started to be used though? This is something I would flag up for refactoring; if it's not used, let's not leave it in the schema. Sometimes there's some common-sense risk reduction that we can use this opportunity to implement.

In [8]:
SELECT Top(10) * FROM [StackOverflow2010].dbo.Users;

Id,AboutMe,Age,CreationDate,DisplayName,DownVotes,EmailHash,LastAccessDate,Location,Reputation,UpVotes,Views,WebsiteUrl,AccountId
-1,"<p>Hi, I'm not really a person.</p> <p>I'm a background process that helps keep this site clean!</p> <p>I do things like</p> <ul> <li>Randomly poke old unanswered questions every hour so they get some attention</li> <li>Own community questions and answers so nobody gets unnecessary reputation from them</li> <li>Own downvotes on spam/evil posts that get permanently deleted</li> <li>Own suggested edits from anonymous users</li> <li><a href=""http://meta.stackexchange.com/a/92006"">Remove abandoned questions</a></li> </ul>",,2008-07-31 00:00:00.000,Community,980920,,2008-08-26 00:16:53.810,on the server farm,1,274835,649,http://meta.stackexchange.com/,-1
1,"<p><a href=""http://www.codinghorror.com/blog/archives/001169.html"" rel=""nofollow"">Stack Overflow Valued Associate #00001</a></p> <p>Wondering how our software development process works? <a href=""http://www.youtube.com/watch?v=08xQLGWTSag"" rel=""nofollow"">Take a look!</a></p> <p>Find me <a href=""http://twitter.com/codinghorror"" rel=""nofollow"">on twitter</a>, or <a href=""http://www.codinghorror.com/blog"" rel=""nofollow"">read my blog</a>. Don't say I didn't warn you <em>because I totally did</em>.</p> <p>However, <a href=""http://www.codinghorror.com/blog/2012/02/farewell-stack-exchange.html"" rel=""nofollow"">I no longer work at Stack Exchange, Inc</a>. I'll miss you all. Well, <em>some</em> of you, anyway. :)</p>",,2008-07-31 14:22:31.287,Jeff Atwood,1309,,2018-08-29 02:34:22.893,"El Cerrito, CA",44300,3367,408587,http://www.codinghorror.com/blog/,1
2,"<p>Developer on the Stack Overflow team. Find me on</p> <p><a href=""http://www.twitter.com/SuperDalgas"" rel=""nofollow noreferrer"">Twitter</a> <br><br> <a href=""http://blog.stackoverflow.com/2009/05/welcome-stack-overflow-valued-associate-00003/"">Stack Overflow Valued Associate #00003</a></p>",,2008-07-31 14:22:31.287,Geoff Dalgas,88,,2018-08-23 17:31:56.427,"Corvallis, OR",3491,650,23966,http://stackoverflow.com,2
3,"<p><a href=""http://blog.stackoverflow.com/2009/01/welcome-stack-overflow-valued-associate-00002/"">Developer on the Stack Overflow team</a>.</p> <p>Was dubbed <strong>SALTY SAILOR</strong> by Jeff Atwood, as filth and flarn would oft-times fly when dealing with a particularly nasty bug!</p> <ul> <li>Twitter me: <a href=""http://twitter.com/jarrod_dixon"" rel=""nofollow noreferrer"">jarrod_dixon</a></li> <li>Email me: jarrod.m.dixon@gmail.com</li> </ul>",,2008-07-31 14:22:31.287,Jarrod Dixon,100,,2018-08-30 20:56:23.897,"Raleigh, NC, United States",13418,7285,24396,http://jarroddixon.com,3
4,"<p>I am:</p> <ul> <li>the co-founder and CEO of <a href=""http://stackoverflow.com"">Stack Overflow</a></li> <li>the co-founder of <a href=""http://www.fogcreek.com"" rel=""nofollow noreferrer"">Fog Creek Software</a></li> <li>the creator of <a href=""http://trello.com"" rel=""nofollow noreferrer"">Trello</a> (now owned by Atlassian)</li> </ul> <p>You can find me on my rarely-updated blog, <a href=""http://joelonsoftware.com"" rel=""nofollow noreferrer"">Joel on Software</a>.</p>",,2008-07-31 14:22:31.317,Joel Spolsky,96,,2018-08-14 22:18:15.227,"New York, NY",28768,797,73755,http://www.joelonsoftware.com/,4
5,"<p>Technical Evangelist at Microsoft, specializing in ASP.NET MVC.</p> <p>I don't use this site anymore because the moderators close or delete far too many of the useful questions.</p>",,2008-07-31 14:22:31.317,Jon Galloway,34,,2018-08-29 16:48:35.993,"San Diego, CA",39172,781,11700,http://weblogs.asp.net/jgalloway/,5
8,"<p>This is a puppet test account I use to validate ""regular user"" stuff on the site</p> <p>-- <a href=""http://stackoverflow.com/users/1/jeff-atwood"" rel=""nofollow"">Jeff Atwood</a>",,2008-07-31 21:33:24.057,Eggs McLaren,9,,2018-04-09 02:04:55.577,,942,12,6372,,6
9,<p>Independent software engineer</p>,,2008-07-31 21:35:26.517,Kevin Dente,4,,2018-08-30 18:18:03.423,"Oakland, CA",14337,46,4949,http://weblogs.asp.net/kdente,7
10,"<p>I'm not takin' my sneakers off! <br><br> Actually, <b>I'm a test account</b>, used to help debug problems here on StackOverflow.</p>",,2008-07-31 21:57:06.240,Sneakers O'Toole,0,,2018-06-08 05:11:12.523,"Morganton, North Carolina United States",101,0,3678,https://www.youtube.com/watch?v=OcSKd13mKUY,8
11,,,2008-08-01 00:59:11.147,Anonymous User,0,,2008-08-01 00:59:11.147,,1890,0,2123,,561854
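
The observation that EmailHash is null in all cases is based on what I can see in a sample, so it is worth confirming across the whole table before flagging the column for removal. A minimal check, assuming the same database and table as the query above:

```sql
-- COUNT(*) counts every row; COUNT(EmailHash) counts only rows where
-- EmailHash is not NULL. If the second number is 0, the column is unused.
SELECT COUNT(*) [total_users],
       COUNT(EmailHash) [users_with_email_hash]
FROM [StackOverflow2010].dbo.Users;
```

If `users_with_email_hash` comes back as anything other than zero, the column needs classifying like any other identifier.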


Next, the Badges table.

In [4]:
SELECT Top(5) * FROM [StackOverflow for Masking].dbo.Badges;

Id,Name,UserId,Date
82946,Teacher,3718,2008-09-15 08:55:03.923
82947,Teacher,994,2008-09-15 08:55:03.957
82949,Teacher,3893,2008-09-15 08:55:03.957
82950,Teacher,4591,2008-09-15 08:55:03.957
82951,Teacher,5196,2008-09-15 08:55:03.957


Slightly odd normalisation decision perhaps, but not sensitive.
OK, leave that one, Votes next.

In [5]:
SELECT Top(5) * FROM [StackOverflow for Masking].dbo.Votes;

Id,PostId,UserId,BountyAmount,VoteTypeId,CreationDate
1,1,,,2,2008-07-31 00:00:00.000
2,3,,,2,2008-07-31 00:00:00.000
3,2,,,2,2008-07-31 00:00:00.000
4,4,,,2,2008-07-31 00:00:00.000
5,6,,,2,2008-07-31 00:00:00.000


Can't see anything there. If we've masked the users well, what they've voted for won't be very interesting. Or _will_ it? If I know the context, I might be able to reidentify someone. Jon Skeet is famously prolific, for example. This is one of those slightly tricky calls that requires business context and a risk assessment. Who is going to see the masked database? What would happen if they reidentified someone using their own knowledge?
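
Two quick queries can feed that risk assessment. This is a sketch under assumptions: I am pointing at `[StackOverflow2010]` rather than the masking copy, and using post volume as a stand-in for "prolific" is my own choice of measure.

```sql
-- How often is UserId actually populated in Votes, per vote type?
-- COUNT(UserId) ignores NULLs, so it shows where a voter is recorded at all.
SELECT VoteTypeId, COUNT(*) [votes], COUNT(UserId) [votes_with_user]
FROM [StackOverflow2010].dbo.Votes
GROUP BY VoteTypeId
ORDER BY VoteTypeId;

-- Which accounts stand out purely by volume of posts?
-- Outliers like these can stay recognisable even after names are masked.
SELECT TOP (10) OwnerUserId, COUNT(*) [posts]
FROM [StackOverflow2010].dbo.Posts
GROUP BY OwnerUserId
ORDER BY COUNT(*) DESC;
```

If a handful of accounts dwarf everyone else, activity counts alone act as a quasi-identifier, and that belongs in the risk assessment alongside the obvious name and URL columns.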

In [6]:
SELECT Top(5) * FROM [StackOverflow for Masking].dbo.Comments;

Id,CreationDate,PostId,Score,Text,UserId
1,2008-09-06 08:07:10.730,35314,36,not sure why this is getting downvoted -- it is correct! Double check it in your compiler if you don't believe him!,1
2,2008-09-06 08:09:52.330,35314,8,"Yeah, I didn't believe it until I created a console app - but good lord! Why would they give you the rope to hang yourself! I hated that about VB.NET - the OrElse and AndAlso keywords!",3
4,2008-09-06 08:42:16.980,35195,0,"I don't see an accepted answer now, I wonder how that got unaccepted. Incidentally, I would have marked an accepted answer based on the answers available at the time. Also, accepted doesn't mean Best :)",380
9,2008-09-06 12:26:30.060,47239,0,"Jonathan: Wow! Thank you for all of that, you did an amazing amount of work!",4550
10,2008-09-06 13:38:23.647,45651,6,It will help if you give some details of which database you are using as techniques vary.,242
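
Five rows is a small sample, so a rough count helps size up how much of the comment text might mention people or link out. This is a sketch, assuming the `[StackOverflow2010]` copy of the table; the two patterns are crude indicators that I picked for illustration, not a classification rule.

```sql
-- Count comments that contain an @ sign (mentions, email addresses)
-- or something that looks like a link. Both patterns will over-match.
SELECT COUNT(*) [total_comments],
       SUM(CASE WHEN [Text] LIKE '%@%' THEN 1 ELSE 0 END) [with_at_sign],
       SUM(CASE WHEN [Text] LIKE '%http%' THEN 1 ELSE 0 END) [with_link]
FROM [StackOverflow2010].dbo.Comments;
```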


Going to need to handle that text I think. Ok, Posts next.

In [3]:
SELECT Top(5) * FROM [StackOverflow2010].dbo.Posts;

Id,AcceptedAnswerId,AnswerCount,Body,ClosedDate,CommentCount,CommunityOwnedDate,CreationDate,FavoriteCount,LastActivityDate,LastEditDate,LastEditorDisplayName,LastEditorUserId,OwnerUserId,ParentId,PostTypeId,Score,Tags,Title,ViewCount
4,7,13,"<p>I want to use a track-bar to change a form's opacity.</p> <p>This is my code:</p> <pre><code>decimal trans = trackBar1.Value / 5000; this.Opacity = trans; </code></pre> <p>When I build the application, it gives the following error:</p> <blockquote>  <p>Cannot implicitly convert type <code>'decimal'</code> to <code>'double'</code>.</p> </blockquote> <p>I tried using <code>trans</code> and <code>double</code> but then the control doesn't work. This code worked fine in a past VB.NET project.</p>",,1,2012-10-31 16:42:47.213,2008-07-31 21:42:52.667,41,2018-07-02 17:55:27.247,2018-07-02 17:55:27.247,Rich B,6786713,8,0,1,573,<c#><floating-point><type-conversion><double><decimal>,Convert Decimal to Double?,37080
6,31,5,"<p>I have an absolutely positioned <code>div</code> containing several children, one of which is a relatively positioned <code>div</code>. When I use a <strong>percentage-based width</strong> on the child <code>div</code>, it collapses to '0' width on <a href=""http://en.wikipedia.org/wiki/Internet_Explorer_7"" rel=""noreferrer"">Internet&nbsp;Explorer&nbsp;7</a>, but not on Firefox or Safari.</p> <p>If I use <strong>pixel width</strong>, it works. If the parent is relatively positioned, the percentage width on the child works.</p> <ol> <li>Is there something I'm missing here?</li> <li>Is there an easy fix for this besides the <em>pixel-based width</em> on the child?</li> <li>Is there an area of the CSS specification that covers this?</li> </ol>",,0,,2008-07-31 22:08:08.620,10,2016-03-19 06:10:52.170,2016-03-19 06:05:48.487,Rich B,63550,9,0,1,256,<html><css><css3><internet-explorer-7>,Percentage width child element in absolutely positioned parent on Internet Explorer 7,16306
7,0,0,<p>An explicit cast to double like this isn't necessary:</p> <pre><code>double trans = (double) trackBar1.Value / 5000.0; </code></pre> <p>Identifying the constant as <code>5000.0</code> (or as <code>5000d</code>) is sufficient:</p> <pre><code>double trans = trackBar1.Value / 5000.0; double trans = trackBar1.Value / 5000d; </code></pre>,,0,,2008-07-31 22:17:57.883,0,2017-12-16 05:06:57.613,2017-12-16 05:06:57.613,,4020527,9,4,2,401,,,0
9,1404,64,"<p>Given a <code>DateTime</code> representing a person's birthday, how do I calculate their age in years? </p>",,7,2011-08-16 19:40:43.080,2008-07-31 23:40:59.743,399,2018-07-25 11:57:14.110,2018-04-21 17:48:14.477,Rich B,3956566,1,0,1,1743,<c#><.net><datetime>,How do I calculate someone's age in C#?,480476
11,1248,35,"<p>Given a specific <code>DateTime</code> value, how do I display relative time, like:</p> <ul> <li>2 hours ago</li> <li>3 days ago</li> <li>a month ago</li> </ul>",,3,2009-09-04 13:15:59.820,2008-07-31 23:55:37.967,529,2018-07-05 04:00:56.633,2017-06-04 15:51:19.780,user2370523,6479704,1,0,1,1348,<c#><datetime><time><datediff><relative-time-span>,Calculate relative time in C#,136033


LastEditorDisplayName is going on the list of columns to mask.
How about Body? This is a bit trickier; the body is in most cases a public post on a website about a technical matter. How could that be confidential?
But here's the rub; GDPR isn't necessarily about confidentiality. It's about taking a risk-based approach to the handling of personal data. The first thing we have to identify is whether it is personal. 
```
Personal data is information that relates to an identified or identifiable individual.

What identifies an individual could be as simple as a name or a number or could include other identifiers such as an IP address or a cookie identifier, or other factors.
```
https://ico.org.uk/for-organisations/guide-to-data-protection/guide-to-the-general-data-protection-regulation-gdpr/key-definitions/what-is-personal-data/

In [5]:
SELECT Top(10) * FROM [StackOverflow2010].dbo.Posts where Body like '%@%';

Id,AcceptedAnswerId,AnswerCount,Body,ClosedDate,CommentCount,CommunityOwnedDate,CreationDate,FavoriteCount,LastActivityDate,LastEditDate,LastEditorDisplayName,LastEditorUserId,OwnerUserId,ParentId,PostTypeId,Score,Tags,Title,ViewCount
18,0,0,"<p>For a table like this:</p> <pre><code>CREATE TABLE binary_data (  id INT(4) NOT NULL AUTO_INCREMENT PRIMARY KEY,  description CHAR(50),  bin_data LONGBLOB,  filename CHAR(50),  filesize CHAR(50),  filetype CHAR(50) ); </code></pre> <p>Here is a PHP example:</p> <pre><code>&lt;?php  // store.php3 - by Florian Dittmer &lt;dittmer@gmx.net&gt;  // Example php script to demonstrate the storing of binary files into  // an sql database. More information can be found at http://www.phpbuilder.com/ ?&gt; &lt;html&gt;  &lt;head&gt;&lt;title&gt;Store binary data into SQL Database&lt;/title&gt;&lt;/head&gt;  &lt;body&gt;  &lt;?php  // Code that will be executed if the form has been submitted:  if ($submit) {  // Connect to the database (you may have to adjust  // the hostname, username or password).  mysql_connect(""localhost"", ""root"", ""password"");  mysql_select_db(""binary_data"");  $data = mysql_real_escape_string(fread(fopen($form_data, ""r""), filesize($form_data)));  $result = mysql_query(""INSERT INTO binary_data (description, bin_data, filename, filesize, filetype) "".  
""VALUES ('$form_description', '$data', '$form_data_name', '$form_data_size', '$form_data_type')"");  $id= mysql_insert_id();  print ""&lt;p&gt;This file has the following Database ID: &lt;b&gt;$id&lt;/b&gt;"";  mysql_close();  } else {  // else show the form to submit new data:  ?&gt;  &lt;form method=""post"" action=""&lt;?php echo $PHP_SELF; ?&gt;"" enctype=""multipart/form-data""&gt;  File Description:&lt;br&gt;  &lt;input type=""text"" name=""form_description"" size=""40""&gt;  &lt;input type=""hidden"" name=""MAX_FILE_SIZE"" value=""1000000""&gt;  &lt;br&gt;File to upload/store in database:&lt;br&gt;  &lt;input type=""file"" name=""form_data"" size=""40""&gt;  &lt;p&gt;&lt;input type=""submit"" name=""submit"" value=""submit""&gt;  &lt;/form&gt;  &lt;?php  }  ?&gt;  &lt;/body&gt; &lt;/html&gt; </code></pre>",,2,,2008-08-01 05:12:44.193,0,2016-06-02 05:56:26.060,2016-06-02 05:56:26.060,Jeff Atwood,126039,0,17,2,56,,,0
27,0,0,"<p>@jeff</p> <p>IMHO yours seems a little long. However it does seem a little more robust with support for ""yesterday"" and ""years"". But in my experience when this is used the person is most likely to view the content in the first 30 days. It is only the really hardcore people that come after that. So that is why I usually elect to keep this short and simple.</p> <p>This is the method I am currently using on one of my websites. This only returns a relative day, hour, time. And then the user has to slap on ""ago"" in the output.</p> <pre><code>public static string ToLongString(this TimeSpan time) {  string output = String.Empty;  if (time.Days &gt; 0)  output += time.Days + "" days "";  if ((time.Days == 0 || time.Days == 1) &amp;&amp; time.Hours &gt; 0)  output += time.Hours + "" hr "";  if (time.Days == 0 &amp;&amp; time.Minutes &gt; 0)  output += time.Minutes + "" min "";  if (output.Length == 0)  output += time.Seconds + "" sec"";  return output.Trim(); } </code></pre>",,0,2009-09-04 13:15:59.820,2008-08-01 12:17:19.357,0,2008-08-01 13:16:49.127,2008-08-01 13:16:49.127,Nick Berardi,17,17,11,2,29,,,0
73,0,0,"<p>@Jax: The <code>extern ""C""</code> thing matters, very very much. If a header file doesn't have one, then (unless it's a C++-only header file), you would have to enclose your <code>#include</code> with it:</p> <pre><code>extern ""C"" { #include &lt;sys/socket.h&gt; // include other similarly non-compliant header files } </code></pre> <p>Basically, anytime where a C++ program wants to link to C-based facilities, the <code>extern ""C""</code> is vital. In practical terms, it means that the names used in external references will not be mangled, like normal C++ names would. <a href=""http://www.parashift.com/c++-faq-lite/mixing-c-and-cpp.html"" rel=""noreferrer"">Reference.</a></p>",,0,,2008-08-01 13:40:16.443,0,2008-08-01 13:54:25.510,2008-08-01 13:54:25.510,Chris Jester-Young,13,13,25,2,18,,,0
109,2585,2,"<p>Recently our site has been deluged with the resurgence of the <a href=""https://en.wikipedia.org/wiki/Asprox_botnet"" rel=""noreferrer"">Asprox botnet</a> <a href=""http://en.wikipedia.org/wiki/SQL_injection"" rel=""noreferrer"">SQL injection</a> attack. Without going into details, the attack attempts to execute SQL code by encoding the <a href=""http://en.wikipedia.org/wiki/Transact-SQL"" rel=""noreferrer"">T-SQL</a> commands in an ASCII encoded BINARY string. It looks something like this:</p> <pre><code>DECLARE%20@S%20NVARCHAR(4000);SET%20@S=CAST(0x44004500...06F007200%20AS%20NVARCHAR(4000));EXEC(@S);-- </code></pre> <p>I was able to decode this in SQL, but I was a little wary of doing this since I didn't know exactly what was happening at the time.</p> <p>I tried to write a simple decode tool, so I could decode this type of text without even touching <a href=""http://en.wikipedia.org/wiki/Microsoft_SQL_Server"" rel=""noreferrer"">SQL&nbsp;Server</a>. The main part I need decoded is:</p> <pre><code>CAST(0x44004500...06F007200 AS NVARCHAR(4000)) </code></pre> <p>I've tried all of the following commands with no luck:</p> <pre><code>txtDecodedText.Text =  System.Web.HttpUtility.UrlDecode(txtURLText.Text); txtDecodedText.Text =  Encoding.ASCII.GetString(Encoding.ASCII.GetBytes(txtURLText.Text)); txtDecodedText.Text =  Encoding.Unicode.GetString(Encoding.Unicode.GetBytes(txtURLText.Text)); txtDecodedText.Text =  Encoding.ASCII.GetString(Encoding.Unicode.GetBytes(txtURLText.Text)); txtDecodedText.Text =  Encoding.Unicode.GetString(Convert.FromBase64String(txtURLText.Text)); </code></pre> <p>What is the proper way to translate this encoding without using SQL Server? Is it possible? 
I'll take VB.NET code since I'm familiar with that too.</p> <hr> <p>Okay, I'm sure I'm missing something here, so here's where I'm at.</p> <p>Since my input is a basic string, I started with just a snippet of the encoded portion - 4445434C41 (which translates to DECLA) - and the first attempt was to do this...</p> <pre><code>txtDecodedText.Text = Encoding.UTF8.GetString(Encoding.UTF8.GetBytes(txtURL.Text)); </code></pre> <p>...and all it did was return the exact same thing that I put in, since it converted each character into is byte.</p> <p>I realized that I need to parse each two characters into a byte manually since I don't know of any methods yet that will do that, so now my little decoder looks something like this:</p> <pre><code>while (!boolIsDone) {  bytURLChar = byte.Parse(txtURLText.Text.Substring(intParseIndex, 2));  bytURL[intURLIndex] = bytURLChar;  intParseIndex += 2;  intURLIndex++;  if (txtURLText.Text.Length - intParseIndex &lt; 2)  {  boolIsDone = true;  } } txtDecodedText.Text = Encoding.UTF8.GetString(bytURL); </code></pre> <p>Things look good for the first couple of pairs, but then the loop balks when it gets to the ""4C"" pair and says that the string is in the incorrect format.</p> <p>Interestingly enough, when I step through the debugger and to the GetString method on the byte array that I was able to parse up to that point, I get "",-+"" as the result.</p> <p>How do I figure out what I'm missing - do I need to do a ""direct cast"" for each byte instead of attempting to parse it?</p>",,3,,2008-08-01 15:23:05.190,0,2015-01-08 17:02:44.503,2015-01-08 16:27:49.863,Jeff Atwood,63550,71,0,1,51,<c#><sql><vb.net><ascii><hex>,Decoding T-SQL CAST in C#/VB.NET,4581
166,0,0,"<p>You can do it by adding the following anywhere in your code </p> <pre><code>$Id:$ </code></pre> <p>So for example @Jeff did:</p> <pre><code>&lt;div id=""svnrevision""&gt;svn revision: $Id:$&lt;/div&gt; </code></pre> <p>and when checked in the server replaced $Id:$ with the current revision number. I also <a href=""http://www.compuphase.com/svnrev.htm"" rel=""nofollow noreferrer"">found this reference</a>.</p> <p>There is also <strong>$Date:$</strong>, <strong>$Rev:$</strong>, <strong>$Revision:$</strong></p>",,7,,2008-08-01 18:08:16.590,0,2015-12-29 06:29:47.283,2015-12-29 06:29:47.283,,3787519,17,163,2,25,,,0
274,0,0,"<p>@Ryan</p> <p>That's correct but it actually makes the tags with the least number, larger. This code has been tested:</p> <pre><code>$artist = array(""the roots"",""michael jackson"",""billy idol"",""more"",""and more"",""and_YET_MORE""); $count = array(5,3,9,1,1,3); $highest = max($count); for ($x = 0; $x &lt; count($artist); $x++) {  $normalized = ($highest - $count[$x]+1) / $highest;  $heading = ceil($normalized * 6); // 6 heading types  echo ""&lt;h$heading&gt;{$artist[$x]}&lt;/h$heading&gt;""; } </code></pre>",,0,,2008-08-01 23:58:35.650,0,2008-08-01 23:58:35.650,,,0,40,227,2,7,,,0
396,0,0,"<p>Use email authentication methods, such as <a href=""http://en.wikipedia.org/wiki/Sender_Policy_Framework"" rel=""noreferrer"">SPF</a>, and <a href=""https://help.ubuntu.com/community/Postfix/DKIM"" rel=""noreferrer"">DKIM</a> to prove that your emails and your domain name belong together, and to prevent spoofing of your domain name. The SPF website includes a wizard to generate the DNS information for your site.</p> <p><a href=""http://remote.12dt.com/"" rel=""noreferrer"">Check</a> your <a href=""http://en.wikipedia.org/wiki/Reverse_DNS_lookup"" rel=""noreferrer"">reverse DNS</a> to make sure the IP address of your mail server points to the domain name that you use for sending mail.</p> <p>Make sure that the IP-address that you're using is <a href=""http://www.spamhaus.org/lookup.lasso"" rel=""noreferrer"">not on a blacklist</a></p> <p>Make sure that the reply-to address is a valid, existing address.</p> <p>Use the full, real name of the addressee in the To field, not just the email-address (e.g. <code>""John Smith"" &lt;john@blacksmiths-international.com&gt;</code> ).</p> <p>Monitor your abuse accounts, such as abuse@yourdomain.com and postmaster@yourdomain.com. That means - make sure that these accounts exist, read what's sent to them, and act on complaints.</p> <p>Finally, make it <strong>really</strong> easy to unsubscribe. Otherwise, your users will unsubscribe by pressing the <strong>spam</strong> button, and that will affect your reputation.</p> <p>That said, getting Hotmail to accept your emails remains a black art.</p>",,7,,2008-08-02 10:21:54.300,0,2017-04-20 16:28:04.430,2017-04-20 16:28:04.430,Paul Tomblin,246246,136,371,2,323,,,0
555,0,0,"<h1>Definitive Article</h1> <h3>Sending credentials</h3> <p>The only practical way to send credentials 100% securely is by using <a href=""http://en.wikipedia.org/wiki/SSL"" rel=""noreferrer"">SSL</a>. Using JavaScript to hash the password is not safe. Common pitfalls for client-side password hashing:</p> <ul> <li>If the connection between the client and server is unencrypted, everything you do is <a href=""https://stackoverflow.com/questions/14907581/ssl-and-man-in-the-middle-misunderstanding"">vulnerable to man-in-the-middle attacks</a>. An attacker could replace the incoming javascript to break the hashing or send all credentials to their server, they could listen to client responses and impersonate the users perfectly, etc. etc. SSL with trusted Certificate Authorities is designed to prevent MitM attacks.</li> <li>The hashed password received by the server is <a href=""https://security.stackexchange.com/questions/45254/owasp-recommendation-on-client-side-password-hashing"">less secure</a> if you don't do additional, redundant work on the server.</li> </ul> <p>There's another secure method called <strong>SRP</strong>, but it's patented (although it is <a href=""http://srp.stanford.edu/license.txt"" rel=""noreferrer"">freely licensed</a>) and there are few good implementations available.</p> <h3>Storing passwords</h3> <p>Don't ever store passwords as plaintext in the database. Not even if you don't care about the security of your own site. Assume that some of your users will reuse the password of their online bank account. So, store the hashed password, and throw away the original. And make sure the password doesn't show up in access logs or application logs. OWASP <a href=""https://www.owasp.org/index.php/Password_Storage_Cheat_Sheet#Impose_infeasible_verification_on_attacker"" rel=""noreferrer"">recommends the use of Argon2</a> as your first choice for new applications. If this is not available, PBKDF2 or scrypt should be used instead. 
And finally if none of the above are available, use bcrypt.</p> <p>Hashes by themselves are also insecure. For instance, identical passwords mean identical hashes--this makes hash lookup tables an effective way of cracking lots of passwords at once. Instead, store the <strong>salted</strong> hash. A salt is a string appended to the password prior to hashing - use a different (random) salt per user. The salt is a public value, so you can store them with the hash in the database. See <a href=""http://www.codeproject.com/Articles/704865/Salted-Password-Hashing-Doing-it-Right"" rel=""noreferrer"">here</a> for more on this.</p> <p>This means that you can't send the user their forgotten passwords (because you only have the hash). Don't reset the user's password unless you have authenticated the user (users must prove that they are able to read emails sent to the stored (and validated) email address.)</p> <h3>Security questions</h3> <p>Security questions are insecure - avoid using them. Why? Anything a security question does, a password does better. Read <strong><em>PART III: Using Secret Questions</em></strong> in <a href=""http://srp.stanford.edu/license.txt"" rel=""noreferrer"">@Jens Roland answer</a> here in this wiki.</p> <h3>Session cookies</h3> <p>After the user logs in, the server sends the user a session cookie. The server can retrieve the username or id from the cookie, but nobody else can generate such a cookie (TODO explain mechanisms).</p> <p><a href=""http://en.wikipedia.org/wiki/Session_hijacking"" rel=""noreferrer"">Cookies can be hijacked</a>: they are only as secure as the rest of the client's machine and other communications. They can be read from disk, sniffed in network traffic, lifted by a cross-site scripting attack, phished from a poisoned DNS so the client sends their cookies to the wrong servers. Don't send persistent cookies. 
Cookies should expire at the end of the client session (browser close or leaving your domain).</p> <p>If you want to autologin your users, you can set a persistent cookie, but it should be distinct from a full-session cookie. You can set an additional flag that the user has auto-logged in, and needs to login for real for sensitive operations. This is popular with shopping sites that want to provide you with a seamless, personalized shopping experience but still protect your financial details. For example, when you return to visit Amazon, they show you a page that looks like you're logged in, but when you go to place an order (or change your shipping address, credit card etc.), they ask you to confirm your password.</p> <p>Financial web sites such as banks and credit cards, on the other hand, only have sensitive data and should not allow auto-login or a low-security mode.</p> <h3>List of external resources</h3> <ul> <li><a href=""http://pdos.csail.mit.edu/papers/webauth:sec10.pdf"" rel=""noreferrer"">Dos and Don'ts of Client Authentication on the Web (PDF)</a><br> 21 page academic article with many great tips. 
</li> <li><a href=""http://news.ycombinator.com/item?id=205572"" rel=""noreferrer"">Ask YC: Best Practices for User Authentication</a><br> Forum discussion on the subject </li> <li><a href=""http://www.codinghorror.com/blog/archives/000953.html"" rel=""noreferrer"">You're Probably Storing Passwords Incorrectly</a><br> Introductory article about storing passwords</li> <li><a href=""http://news.ycombinator.com/item?id=55660"" rel=""noreferrer"">Discussion: Coding Horror: You're Probably Storing Passwords Incorrectly</a><br> Forum discussion about a Coding Horror article.</li> <li>Never store passwords in a database!<br> Another warning about storing passwords in the database.</li> <li><a href=""http://en.wikipedia.org/wiki/Password_cracking"" rel=""noreferrer"">Password cracking</a><br> Wikipedia article on weaknesses of several password hashing schemes.</li> <li><a href=""http://www.securityfocus.com/blogs/262"" rel=""noreferrer"">Enough With The Rainbow Tables: What You Need To Know About Secure Password Schemes</a><br> Discussion about rainbow tables and how to defend against them, and against other threads. Includes extensive discussion.</li> </ul>",,5,2008-10-01 00:39:41.137,2008-08-02 20:40:45.533,0,2018-03-07 06:35:31.907,2018-03-07 06:35:31.907,hasen j,811240,136,549,2,390,,,0
629,0,0,"<p>I found <a href=""http://raulparolari.com/Ruby2/lambda_Proc"" rel=""nofollow noreferrer"">this page</a> which shows what the difference between <code>Proc.new</code> and <code>lambda</code> are. According to the page, the only difference is that a lambda is strict about the number of arguments it accepts, whereas <code>Proc.new</code> converts missing arguments to <code>nil</code>. Here is an example IRB session illustrating the difference:</p> <pre>irb(main):001:0> l = lambda { |x, y| x + y } => #&lt;Proc:0x00007fc605ec0748@(irb):1&gt; irb(main):002:0> p = Proc.new { |x, y| x + y } => #&lt;Proc:0x00007fc605ea8698@(irb):2&gt; irb(main):003:0> l.call ""hello"", ""world"" => ""helloworld"" irb(main):004:0> p.call ""hello"", ""world"" => ""helloworld"" irb(main):005:0> l.call ""hello"" ArgumentError: wrong number of arguments (1 for 2)  from (irb):1  from (irb):5:in `call'  from (irb):5  from :0 irb(main):006:0> p.call ""hello"" TypeError: can't convert nil into String  from (irb):2:in `+'  from (irb):2  from (irb):6:in `call'  from (irb):6  from :0</pre> <p>The page also recommends using lambda unless you specifically want the error tolerant behavior. I agree with this sentiment. Using a lambda seems a tad more concise, and with such an insignificant difference, it seems the better choice in the average situation.</p> <p>As for Ruby 1.9, sorry, I haven't looked into 1.9 yet, but I don't imagine they would change it all that much (don't take my word for it though, it seems you have heard of some changes, so I am probably wrong there).</p>",,2,,2008-08-03 07:28:54.070,0,2016-10-21 18:36:33.597,2016-10-21 18:36:33.597,,3257186,122,626,2,41,,,0
655,0,0,"<p>With the ""Built in"" stuff, you can't, as using 1.0.* or 1.0.0.* will replace the revision and build numbers with a coded date/timestamp, which is usually also a good way. </p> <p>For more info, see the <a href=""http://msdn2.microsoft.com/en-us/library/c405shex(vs.80).aspx"" rel=""noreferrer"">Assembly Linker</a> Documentation in the /v tag.</p> <p>As for automatically incrementing numbers, use the AssemblyInfo Task:</p> <p><a href=""http://code.msdn.microsoft.com/AssemblyInfoTaskvers"" rel=""noreferrer"">AssemblyInfo Task</a></p> <p>This can be configured to automatically increment the build number.</p> <p>There are 2 Gotchas:</p> <ol> <li>Each of the 4 numbers in the Version string is limited to 65535. This is a Windows Limitation and unlikely to get fixed. <ul> <li><a href=""http://blogs.msdn.com/msbuild/archive/2007/01/03/why-are-build-numbers-limited-to-65535.aspx"" rel=""noreferrer"">Why are build numbers limited to 65535?</a></li> </ul></li> <li>Using with with Subversion requires a small change: <ul> <li><a href=""http://www.andrewconnell.com/blog/archive/2006/08/29/4078.aspx"" rel=""noreferrer"">Using MSBuild to generate assembly version info at build time (including SubVersion fix)</a></li> </ul></li> </ol> <p>Retrieving the Version number is then quite easy:</p> <pre><code>Version v = Assembly.GetExecutingAssembly().GetName().Version; string About = string.Format(CultureInfo.InvariantCulture, @""YourApp Version {0}.{1}.{2} (r{3})"", v.Major, v.Minor, v.Build, v.Revision); </code></pre> <hr> <p>And, to clarify: In .net or at least in C#, the build is actually the THIRD number, not the fourth one as some people (for example Delphi Developers who are used to Major.Minor.Release.Build) might expect.</p> <p>In .net, it's Major.Minor.Build.Revision.</p>",,6,,2008-08-03 11:41:38.490,0,2013-01-28 04:10:16.497,2013-01-28 04:10:16.497,user1873471,0,91,650,2,87,,,0


There are references to other people here, and what looks like someone's personal blog URL as well. This is identifiable data. It might be low in risk and confidentiality, but that's an assessment for later.
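
The `'%@%'` search casts a wide net: it picks up @-mentions and T-SQL variables as well as email addresses. A slightly tighter pattern, sketched below, requires a letter, digit, dot or underscore immediately before the @ and a letter or digit immediately after it. That is my own rough heuristic, not a validated email matcher. It drops mentions that follow a space or a tag, but it will still match things like the URL-encoded `%20@S` in the SQL injection post above.

```sql
-- Narrow the search to an @ that sits between two word-like characters,
-- which is closer to the shape of an email address than a bare '%@%'.
-- LIKE cannot express a full email pattern, so expect false positives.
SELECT TOP (10) Id, OwnerUserId, LastEditorDisplayName
FROM [StackOverflow2010].dbo.Posts
WHERE Body LIKE '%[A-Za-z0-9._]@[A-Za-z0-9]%';
```

Returning only Id and the user columns keeps the result readable. The matching Body values can then be pulled one at a time for review.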