Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Standardize indentation of tcl files #292

Merged
merged 3 commits into from
Jan 11, 2022

Conversation

JelteF
Copy link
Contributor

@JelteF JelteF commented Dec 13, 2021

The tcl files in this repo did not seem to adhere to any common
indentation style. This made it quite hard for me to understand what the
code was doing. To resolve this I used the reformat.tcl script that is
provided on the tcl wiki:
https://wiki.tcl-lang.org/page/Reformatting+Tcl+code+indentation

I replaced the original reformat proc that was provided in the first
message, with the reformat2 proc that was contributed by aplsimple in
the last message on the thread. I did make one final change to this
reformat2 script, such that does not exclude comments from indentation.

I think this makes the code much more readable.

I understand that this is a big PR, but it only changes whitespace. If
you look at the diff by excluding whitespace, there's only one change.
And this change is a change from file encoding from latin1 to utf-8.

To look at the diff with whitespace only, simply look at the diff in github
and add ?w=1 to the URL bar of your browser.

@JelteF JelteF force-pushed the auto-indent-files branch 3 times, most recently from 762927f to d8871a8 Compare December 13, 2021 13:33
@sm-shaw
Copy link
Contributor

sm-shaw commented Dec 13, 2021

Hi, to verify, has this PR been fully tested so that the reformatting does not inadvertently impact functionality?

A previous change introduced indentation, however this had to be reversed as it stopped some functionality working 4487e72.

HammerDB is written in TCL but then some commands also load TCL build and driver scripts, so there is TCL embedded in TCL and then SQL and stored procedures embedded in these scripts that may also be dependent on formatting. Also, to prevent having many different scripts for different options HammerDB may edit the scripts inline before loading them and where the edits go is based on position and pattern matching. Therefore, all the options below such as Use All Warehouses, Time profiling and asynchronous scaling (and combinations of these) would need to be tested (with GUI, CLI and WS) to ensure that the formatting before the inline editing did not impact their loading. eg by checking the following options:

driveroptions

If you can confirm what testing has been done, then we can take a look.

@JelteF
Copy link
Contributor Author

JelteF commented Dec 13, 2021

I tried with Postgres from the CLI, with combination "All warehouses" and "Time profiling" both turned on (asynchronous scaling is turned off).

@JelteF
Copy link
Contributor Author

JelteF commented Dec 13, 2021

Could you share a link to (some of) this position and pattern matching code? I would like to see it so I can more confidently say that these changes should not impact these patterns.

@sm-shaw
Copy link
Contributor

sm-shaw commented Dec 13, 2021

Yes, some of this is in the shared functions

https://github.com/TPC-Council/HammerDB/blob/master/src/generic/gentpcc.tcl

and others directly in the database file being modified, such as pgoltp.tcl starting here for adding the connect pooling option:

proc insert_pgconnectpool_drivescript { testtype timedtype } {

I will also take a look and test.

@JelteF
Copy link
Contributor Author

JelteF commented Dec 15, 2021

Looking at the commit that you shared where you undid indentation to fix an issue I think I found the reason why that was necessary: The original indentation also indented the multiline string containing:

#EDITABLE OPTIONS##################################################
...

So any pattern matching on this string would then be impacted.

In this PR multiline strings are not indented, specifically for reasons like this where the extra whitespace matters. To quote the author of the reformat2 proc ath the end of the thread I linked above:

  • it doesn't format the multi-line strings: the initial spaces in continuation strings may matter;

So this PR make sure all code continues working.

For completeness, this is the script I used to reformat:

proc reformat {tclcode {pad 2}} {

  set lines [split $tclcode \n]
  set out ""
  set nquot 0   ;# count of quotes
  set ncont 0   ;# count of continued strings
  set line [lindex $lines 0]
  set indent [expr {([string length $line]-[string length [string trimleft $line \ \t]])/$pad}]
  set padst [string repeat " " $pad]
  foreach orig $lines {
    incr lineindex
    if {$lineindex>1} {append out \n}
    set newline [string trimleft $orig]
    if {$newline==""} continue
    set is_quoted $nquot
    set is_continued $ncont
    if {[string index $orig end] eq "\\"} {
      incr ncont
    } else {
      set ncont 0
    }
    # if { [string index $newline 0]=="#" } {
    #   set line $orig   ;# don't touch comments
    # } else {
      set npad [expr {$indent * $pad}]
      set line [string repeat $padst $indent]$newline
      set i [set ns [set nl [set nr [set body 0]]]]
      for {set n [string length $newline]} {$i<$n} {incr i} {
        set ch [string index $newline $i]
        if {$ch=="\\"} {
          set ns [expr {[incr ns] % 2}]
        } elseif {!$ns} {
          if {$ch=="\""} {
            set nquot [expr {[incr nquot] % 2}]
          } elseif {!$nquot} {
            switch $ch {
              "\{" {
                if {[string range $newline $i $i+2]=="\{\"\}"} {
                  incr i 2  ;# quote in braces - correct (though tricky)
                } else {
                  incr nl
                  set body -1
                }
              }
              "\}" {
                incr nr
                set body 0
              }
            }
          }
        } else {
          set ns 0
        }
      }
      set nbbraces [expr {$nl - $nr}]
      incr totalbraces $nbbraces
      if {$totalbraces<0} {
        error "Line $lineindex: unbalanced braces!"
      }
      incr indent $nbbraces
      if {$nbbraces==0} { set nbbraces $body }
      if {$is_quoted || $is_continued} {
        set line $orig     ;# don't touch quoted and continued strings
      } else {
        set np [expr {- $nbbraces * $pad}]
        if {$np>$npad} {   ;# for safety too
          set np $npad
        }
        set line [string range $line $np end]
      }
    # }
    append out $line
  }
  return $out
 }

proc eol {} {
    switch -- $::tcl_platform(platform) {
        windows {return \r\n}
        unix {return \n}
        macintosh {return \r}
        default {error "no such platform: $::tc_platform(platform)"}
    }
}

proc count {string char} {
    set count 0
    while {[set idx [string first $char $string]]>=0} {
        set backslashes 0
        set nidx $idx
        while {[string equal [string index $string [incr nidx -1]] \\]} {
            incr backslashes
        }
        if {$backslashes % 2 == 0} {
            incr count
        }
        set string [string range $string [incr idx] end]
    }
    return $count
}

set usage "reformat.tcl ?-indent number? filename"

if {[llength $argv]!=0} {
    if {[lindex $argv 0] eq "-indent"} {
        set indent [lindex $argv 1]
        set argv [lrange $argv 2 end]
    } else  {
        set indent 4
    }
    if {[llength $argv]>1} {
        error $usage
    }
    set f [open $argv r]
    set data [read $f]
    close $f
    set permissions [file attributes $argv -permissions]

    set filename "$argv.tmp"
    set f [open $filename  w]

    puts -nonewline $f [reformat [string map [list [eol] \n] $data] $indent]
    close $f
    file copy -force $filename  $argv
    file delete -force $filename
    file attributes $argv -permissions $permissions
}

@JelteF
Copy link
Contributor Author

JelteF commented Dec 15, 2021

I also have another PR that adds optional Citus support. I'll wait with opening that to this repo until this is merged, because it is built on top of it. I have it opened against this branch on our own repo though. It would be great if you could have a look at that one too: citusdata#2

@sm-shaw
Copy link
Contributor

sm-shaw commented Dec 15, 2021

Many thanks again, the pull requests are very much appreciated. Just to set expectations on timescales, all HammerDB pull requests need to go through the TPC-OSS subcommittee before approval - as a lot of members are going to be out for the next couple of weeks the next meeting will now be 11th Jan, so this will be the earliest that the pull requests can be accepted.

I will commit to having reviewed them by this date so if not this week it will be early Jan and hopefully have recommended any changes if needed, so we can aim for 11th Jan with the subcommittee. I hope this meets requirements for having them included.

@sm-shaw
Copy link
Contributor

sm-shaw commented Jan 5, 2022

Looking at the pull requests to provide recommendations for the next TPC-OSS meeting on 11th Jan.

For the indentation, I've tested on multiple databases, and it doesn't affect the functionality which is good.

However, there is an issue that the formatting is not aware that HammerDB is building scripts that get inserted into a Script Editor and therefore indents these inserted scripts as part of the surrounding procedures. instead anything wrapped by .ed_mainFrame.mainwin.textFrame.left.text fastinsert should not be indented - or should be treated as a separate script in the editor for indentation.

For example, as below the indentation sees the "if" statements in the build script in the script editor as part of the proc build_mssqlstpcc which is not correct as instead these are in the build script that gets loaded into the editor.

proc build_mssqlstpcc {} {
.ed_mainFrame.mainwin.textFrame.left.text fastinsert end "#!/usr/local/bin/tclsh8.6
#LOAD LIBRARIES AND MODULES
set library $library
set version $version
"
.ed_mainFrame.mainwin.textFrame.left.text fastinsert end {if [catch {package require $library $version} message] { error "Failed to load $library - $message" }
if [catch {::tcl::tm::path add modules} ] { error "Failed to find modules directory" }
if [catch {package require tpcccommon} ] { error "Failed to load tpcc common functions" } else { namespace import tpcccommon::* }
proc CreateStoredProcs { odbc imdb } {
puts "CREATING TPCC STORED PROCEDURES"
if { $imdb } {
set sql(1) {CREATE PROCEDURE [dbo].[neword]

Therefore with this PR a driver script appears as follows with the indentation continuing from the procedure that built and inserted it.

indent

This is one of the reasons why this has not been done previously as the formatters are not aware of the procs that build scripts for the editor and also the SQL that is contained within these scripts need separate SQL rather than TCL formatting as well.
So this is definitely desirable to make more readable, but the reformat script used does not produce the right results for the script editor.

Also to note that there is a TCL script in the agent directory as well.

The tcl files in this repo did not seem to adhere to any common
indentation style. This made it quite hard for me to understand what the
code was doing. To resolve this I used the reformat.tcl script that is
provided on the tcl wiki:
https://wiki.tcl-lang.org/page/Reformatting+Tcl+code+indentation

I replaced the original reformat proc that was provided in the first
message, with the reformat2 proc that was contributed by aplsimple in
the last message on the thread. I did make one final change to this
reformat2 script, such that does not exclude comments from indentation.

I think this makes the code much more readable.

I understand that this is a big PR, but it only changes whitespace. If
you look at the diff by excluding whitespace, there's only one change.
And this change is a change from file encoding from latin1 to utf-8.
@JelteF JelteF force-pushed the auto-indent-files branch 3 times, most recently from 70a4f94 to d6f26b8 Compare January 7, 2022 08:58
@JelteF
Copy link
Contributor Author

JelteF commented Jan 7, 2022

I've modified the reformat script from the wiki to know about the "fastinsert end" command. Now indentation restarts at 0 in these blocks. Looking at the code in the script editor now shows indentation that makes sense.

@sm-shaw
Copy link
Contributor

sm-shaw commented Jan 7, 2022

Yes, this looks good now for the formatting and functionality, so recommend to the subcommittee that we accept this PR. Very welcome as this is something that has been put off for a long time.

One minor change needed is that at the moment format.tcl is in the modules directory but is not a module. So it would make sense to make it a module loaded by the CLI and make the formatting an additional CLI command with some additional output to show what is happening when the source file is loaded and formatted.
I will add those changes, so it fits with the overall structure.

if [catch {::tcl::tm::path add modules} ] { error "Failed to find modules directory" }
if [catch {package require tpcccommon} ] { error "Failed to load tpcc common functions" } else { namespace import tpcccommon::* }

For subcommittee review, the following shows the correction in the Script Editor from the earlier image:
foramtsnip

@JelteF
Copy link
Contributor Author

JelteF commented Jan 7, 2022

One minor change needed is that at the moment format.tcl is in the modules directory but is not a module. So it would make sense to make it a module loaded by the CLI and make the formatting an additional CLI command with some additional output to show what is happening when the source file is loaded and formatted.
I will add those changes, so it fits with the overall structure.

Yeah I wasn't sure where to put the file, such that it made sense. But I did want to put it in the repo, so it could be used in the future when necessary. I'll leave changing the exact location and usage to you then.

Btw, I did not run the script for all the module files in the module directory. I thought some of them might be modules that are simply vendored in, so I didn't want to touch touch them because that could introduce conflicts when updating them from upstream in the future. But if there's certain modules that are owned by this repo, then it would make sense to run the reformat script on them too.

@sm-shaw
Copy link
Contributor

sm-shaw commented Jan 11, 2022

Created citusdata#5 to change reformat.tcl into a module

This adds a command to the CLI only called reformat_src used as follows:

hammerdb>reformat_src 4 /home/steve/test.tcl
Reformatting /home/steve/test.tcl with indent 4
Reformat of /home/steve/test.tcl complete

hammerdb>

This command has intentionally not been added to the help index to prevent inadvertent use.
However, having this included with HammerDB enables reformatting aware of HammerDB building and inserting scripts into a script editor.

When citusdata#5 is accepted, this should add the change automatically and #292 will be ready to be accepted.

@sm-shaw
Copy link
Contributor

sm-shaw commented Jan 11, 2022

Merging Pull Request as voted on by TPC-OSS subcommittee on 11th Jan 2022.

@sm-shaw sm-shaw merged commit 9ce93b6 into TPC-Council:master Jan 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants