Deployment repositories are larger than expected #1483

Closed
grahl opened this Issue May 7, 2017 · 12 comments

2 participants
@grahl
Contributor

grahl commented May 7, 2017

Hi

My system information:

  • Operating system type: macOS & Linux
  • BLT version: 8.8.3

I want to execute blt deploy from my GitLab CI job. This generally works fine when calling, for example:

blt deploy -Ddeploy.branch="master" -Ddeploy.commitMsg="Release $CI_COMMIT_REF_NAME"

However, when I specify it as follows with a tag, the branch does not get pushed, only the tag:

blt deploy -Ddeploy.branch="master" -Ddeploy.tag="$CI_COMMIT_REF_NAME" -Ddeploy.commitMsg="Release $CI_COMMIT_REF_NAME"

I looked at issue #830, which might be related to the tag support, but did not find any applicable hints.

I suppose I could call blt deploy and blt deploy:tag separately? However, I want to avoid building the site twice.

Thanks for any information on whether this might be a bug or a misunderstanding on my part.

@grasmash

Collaborator

grasmash commented May 8, 2017

@grahl This is currently working as designed, though I'd be open to changing it. See https://github.com/acquia/blt/blob/8.8.3/phing/tasks/deploy.xml#L91

At present, you can either build and push a tag, or build and push a branch, but not both. I know that we had a good reason for doing this... though I can't remember exactly what problem was caused by doing both at once.

I'm going to consider this a feature request.

@grasmash grasmash added the Enhancement label May 8, 2017

@grahl

Contributor Author

grahl commented Jul 18, 2017

Thanks for your feedback, grasmash. So, is the recommended usage for running a tagged release to check out the tag directly and run in detached HEAD? We usually run on master and use tags for convenience only.

@grasmash

Collaborator

grasmash commented Aug 2, 2017

Yes, that is the recommended usage. That allows you to have corresponding tags on the source repository and the artifact branch.

@grahl

Contributor Author

grahl commented Aug 7, 2017

Thanks again for your feedback.

However, I noticed something peculiar: The repository size (at least as reported by Gitlab) doubled immediately once blt deploy pushed the tag. With additional commits the repo size did not increase by the size of the data again.

My assumption (which could be wrong) is that git is smart enough to see that commit references following each other on HEAD are clearly related but it has problems if they differ significantly.
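
One detail relevant to this assumption can be checked directly: Git names each file's contents (a "blob") by a hash of the content alone, so identical files in two otherwise unrelated commits or tags refer to the same stored object. The following is a minimal sketch in a throwaway repository; the file names and the temporary directory are made up for illustration and are not part of any BLT project.

```shell
#!/bin/sh
# Sketch: identical content yields the same blob ID, regardless of path or history.
set -eu

work=$(mktemp -d)          # throwaway directory, hypothetical
cd "$work"
git init -q

printf 'same content\n' > a.txt
mkdir sub
printf 'same content\n' > sub/b.txt
printf 'different content\n' > c.txt

# git hash-object prints the object ID Git would use for the file's content.
id_a=$(git hash-object a.txt)
id_b=$(git hash-object sub/b.txt)
id_c=$(git hash-object c.txt)

echo "a.txt:     $id_a"
echo "sub/b.txt: $id_b"
echo "c.txt:     $id_c"
```

This only shows how objects are identified. It does not by itself explain how much disk space a freshly pushed tag occupies before the objects are packed.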

I'm worried that this would be an issue with multiple parallel release workflows. Consider the following project: a master branch produces three bugfix releases (1.3.1, 1.3.2, and 1.3.3) in the same week that the testing branch produces two releases (2.0.0-rc1 and 2.0.0-rc2) between the bugfix releases. My assumption would be that, with blt deploy using a tag, the build repo would have trouble managing the diffs between these releases, causing significant overhead if the different development streams were not on distinct branches. From my perspective, adding a branch to a tag deployment should not cause a regression in behavior, since one could still check out tags directly.

Thanks for your review of this potential issue.

@grasmash

Collaborator

grasmash commented Sep 27, 2017

The reason that the repository increased in size so much after the first deployment is that the artifact contains many more files than the source. For instance, none of the contributed modules or vendored packages are committed to the source repository.

After the initial deployment, I don't see repository sizes being prohibitive.

@grasmash grasmash closed this Sep 27, 2017

@grahl

Contributor Author

grahl commented Sep 27, 2017

Hi

I think your conclusion is incorrect and the issue is not resolved. I am seeing this in an empty deployment repository unassociated with the project repository.

Specifically, I have a project here where the extracted files take up 211MB on disk while they take up 780MB in Gitlab. This is a project with only 8 commits listed (one for each commit on a branch before I switched blt deploy from branch to tag) and 20 tag deployments. The changes between those are small and definitely not multiples of the project itself and thus the repository size should be below 300MB.

I can also now reproduce this locally:

  1. Add a directory as the remote in project.yml, e.g. /Users/user/deployment-test
  2. cd /Users/user/deployment-test && git init
  3. blt deploy --tag 0.0.1
  4. du -hs deployment-test: 39M
  5. blt deploy --tag 0.0.2 (exact same codebase)
  6. du -hs deployment-test: 79M
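
The steps above can be wrapped in a small measuring script. This is only a sketch: the helper names repo_size_kb and growth_percent are made up here, and the blt calls are left as comments because they assume a working BLT project whose remote in project.yml points at the test directory.

```shell
#!/bin/sh
# Sketch: measure how much a deployment repository grows between two deployments.

# Print the on-disk size of a directory in kilobytes (hypothetical helper).
repo_size_kb() {
  du -sk "$1" | awk '{ print $1 }'
}

# Print the integer growth in percent from size $1 to size $2 (hypothetical helper).
growth_percent() {
  echo $(( ($2 - $1) * 100 / $1 ))
}

# Usage, assuming blt is installed and configured as in steps 1 and 2:
#   blt deploy --tag 0.0.1
#   before=$(repo_size_kb /Users/user/deployment-test)
#   blt deploy --tag 0.0.2
#   after=$(repo_size_kb /Users/user/deployment-test)
#   echo "grew by $(growth_percent "$before" "$after")%"
```

With the sizes reported in steps 4 and 6, growth_percent 39 79 gives roughly a doubling.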

@grasmash grasmash reopened this Sep 27, 2017

@grasmash grasmash changed the title from "Inconsistent behavior with Ddeploy.tag" to "Deployment repositories are larger than expected" Sep 28, 2017

@grasmash

Collaborator

grasmash commented Sep 28, 2017

Thanks @grahl, I have reopened the issue.

@grasmash

Collaborator

grasmash commented Sep 28, 2017

I sort of reproduced this locally.

I ran two deployments. After the first deployment, the deployment repository size was 220 MB. After the second deployment it was 270 MB. This isn't a doubling of size, but 50 MB is way more than I would expect.

git diff 0.0.1 0.0.2 shows only:

diff --git a/deployment_identifier b/deployment_identifier
index 8acdd82b7..4e379d2bf 100644
--- a/deployment_identifier
+++ b/deployment_identifier
@@ -1 +1 @@
-0.0.1
+0.0.2
diff --git a/vendor/autoload.php b/vendor/autoload.php
index 610554915..21fee23cd 100644
--- a/vendor/autoload.php
+++ b/vendor/autoload.php
@@ -4,4 +4,4 @@
 
 require_once __DIR__ . '/composer/autoload_real.php';
 
-return ComposerAutoloaderInit6615f461fcda20694f07194b6b6f9a8f::getLoader();
+return ComposerAutoloaderInit3430ed319372762179469f619afacc6a::getLoader();
diff --git a/vendor/composer/autoload_real.php b/vendor/composer/autoload_real.php
index 6abf28b50..aa1a8c15a 100644
--- a/vendor/composer/autoload_real.php
+++ b/vendor/composer/autoload_real.php
@@ -2,7 +2,7 @@
 
 // autoload_real.php @generated by Composer
 
-class ComposerAutoloaderInit6615f461fcda20694f07194b6b6f9a8f
+class ComposerAutoloaderInit3430ed319372762179469f619afacc6a
 {
     private static $loader;
 
@@ -19,15 +19,15 @@ public static function getLoader()
             return self::$loader;
         }
 
-        spl_autoload_register(array('ComposerAutoloaderInit6615f461fcda20694f07194b6b6f9a8f', 'loadClassLoader'), true, true);
+        spl_autoload_register(array('ComposerAutoloaderInit3430ed319372762179469f619afacc6a', 'loadClassLoader'), true, true);
         self::$loader = $loader = new \Composer\Autoload\ClassLoader();
-        spl_autoload_unregister(array('ComposerAutoloaderInit6615f461fcda20694f07194b6b6f9a8f', 'loadClassLoader'));
+        spl_autoload_unregister(array('ComposerAutoloaderInit3430ed319372762179469f619afacc6a', 'loadClassLoader'));
 
         $useStaticLoader = PHP_VERSION_ID >= 50600 && !defined('HHVM_VERSION') && (!function_exists('zend_loader_file_encoded') || !zend_loader_file_encoded());
         if ($useStaticLoader) {
             require_once __DIR__ . '/autoload_static.php';
 
-            call_user_func(\Composer\Autoload\ComposerStaticInit6615f461fcda20694f07194b6b6f9a8f::getInitializer($loader));
+            call_user_func(\Composer\Autoload\ComposerStaticInit3430ed319372762179469f619afacc6a::getInitializer($loader));
         } else {
             $map = require __DIR__ . '/autoload_namespaces.php';
             foreach ($map as $namespace => $path) {
@@ -48,19 +48,19 @@ public static function getLoader()
         $loader->register(true);
 
         if ($useStaticLoader) {
-            $includeFiles = Composer\Autoload\ComposerStaticInit6615f461fcda20694f07194b6b6f9a8f::$files;
+            $includeFiles = Composer\Autoload\ComposerStaticInit3430ed319372762179469f619afacc6a::$files;
         } else {
             $includeFiles = require __DIR__ . '/autoload_files.php';
         }
         foreach ($includeFiles as $fileIdentifier => $file) {
-            composerRequire6615f461fcda20694f07194b6b6f9a8f($fileIdentifier, $file);
+            composerRequire3430ed319372762179469f619afacc6a($fileIdentifier, $file);
         }
 
         return $loader;
     }
 }
 
-function composerRequire6615f461fcda20694f07194b6b6f9a8f($fileIdentifier, $file)
+function composerRequire3430ed319372762179469f619afacc6a($fileIdentifier, $file)
 {
     if (empty($GLOBALS['__composer_autoload_files'][$fileIdentifier])) {
         require $file;
diff --git a/vendor/composer/autoload_static.php b/vendor/composer/autoload_static.php
index b3649a9da..bb1ef35f7 100644
--- a/vendor/composer/autoload_static.php
+++ b/vendor/composer/autoload_static.php
@@ -4,7 +4,7 @@
 
 namespace Composer\Autoload;
 
-class ComposerStaticInit6615f461fcda20694f07194b6b6f9a8f
+class ComposerStaticInit3430ed319372762179469f619afacc6a
 {
     public static $files = array (
         '0e6d7bf4a5811bfa5cf40c5ccd6fae6a' => __DIR__ . '/..' . '/symfony/polyfill-mbstring/bootstrap.php',
@@ -6858,10 +6858,10 @@ class ComposerStaticInit6615f461fcda20694f07194b6b6f9a8f
     public static function getInitializer(ClassLoader $loader)
     {
         return \Closure::bind(function () use ($loader) {
-            $loader->prefixLengthsPsr4 = ComposerStaticInit6615f461fcda20694f07194b6b6f9a8f::$prefixLengthsPsr4;
-            $loader->prefixDirsPsr4 = ComposerStaticInit6615f461fcda20694f07194b6b6f9a8f::$prefixDirsPsr4;
-            $loader->prefixesPsr0 = ComposerStaticInit6615f461fcda20694f07194b6b6f9a8f::$prefixesPsr0;
-            $loader->classMap = ComposerStaticInit6615f461fcda20694f07194b6b6f9a8f::$classMap;
+            $loader->prefixLengthsPsr4 = ComposerStaticInit3430ed319372762179469f619afacc6a::$prefixLengthsPsr4;
+            $loader->prefixDirsPsr4 = ComposerStaticInit3430ed319372762179469f619afacc6a::$prefixDirsPsr4;
+            $loader->prefixesPsr0 = ComposerStaticInit3430ed319372762179469f619afacc6a::$prefixesPsr0;
+            $loader->classMap = ComposerStaticInit3430ed319372762179469f619afacc6a::$classMap;
 
         }, null, ClassLoader::class);
     }

These are expected changes, even though the source repository didn't change. Composer re-dumps the autoloader with unique hashes.

@grasmash

Collaborator

grasmash commented Sep 28, 2017

I found that after running git gc the deployment repo became 234 MB. This can be repeated multiple times:

  • create deployment
  • deployment repo becomes 270 MB
  • run git gc
  • deployment repo becomes 234 MB

So, after garbage collection, the repository does not appear to grow significantly in size as a result of repeated deployments.
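
The pattern above can be imitated with plain Git, without BLT. The sketch below rests on an assumption about the mechanism that is not confirmed in this thread: new objects are first written as individually compressed loose files, and only git gc packs and delta-compresses them. The generated files, the temporary directory, and the helper names are made up for illustration.

```shell
#!/bin/sh
# Sketch: two near-identical "deployments" in a throwaway repo, measured before and after git gc.
set -eu

repo=$(mktemp -d)                     # throwaway repository, hypothetical
cd "$repo"
git init -q
git config user.email "sketch@example.com"
git config user.name "Sketch"
git config gc.auto 0                  # keep automatic gc out of the measurement

# Write 100 text files; only the last line of each depends on the marker in $1.
write_files() {
  i=0
  while [ "$i" -lt 100 ]; do
    awk -v s="$i" -v m="$1" 'BEGIN {
      for (n = 0; n < 2000; n++) print "file", s, "line", n
      print "build", m
    }' > "file_$i.php"
    i=$((i + 1))
  done
}

# Size of the .git directory in kilobytes.
git_dir_kb() {
  du -sk .git | awk '{ print $1 }'
}

write_files 0.0.1
git add -A
git commit -q -m "Deploy 0.0.1"
git tag 0.0.1

write_files 0.0.2
git add -A
git commit -q -m "Deploy 0.0.2"
git tag 0.0.2

before_gc=$(git_dir_kb)
git gc -q
after_gc=$(git_dir_kb)

echo "before gc: ${before_gc} KB, after gc: ${after_gc} KB"
```

Whether this is the effect seen in the BLT deployment repository is not established here. The sketch only shows that a repository can shrink under git gc even when every object is still referenced.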

@grasmash

Collaborator

grasmash commented Sep 28, 2017

I'm not sure whether this should be considered a BLT issue. It does in fact appear to be related to the way Git stores information, and executing git gc does appear to resolve the problem.

Apparently, this garbage collection happens automatically on most repositories, cleaning up unreferenced objects that are more than two weeks old. See https://git-scm.com/docs/git-gc.

In that sense, this may resolve itself for most repositories.

Admittedly, I do not know exactly what about the deployment command causes the generation of excess data (which is subsequently cleared out by garbage collection) or exactly what excess data is generated.
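
One way to narrow down what kind of data is occupying the space is git count-objects -v, which reports loose objects (count) separately from packed ones (in-pack). The following is a minimal sketch in a throwaway repository; the helper names loose_count and packed_count are made up, and the single committed file only stands in for a deployment artifact.

```shell
#!/bin/sh
# Sketch: compare loose and packed object counts before and after git gc.
set -eu

repo=$(mktemp -d)                     # throwaway repository, hypothetical
cd "$repo"
git init -q
git config user.email "sketch@example.com"
git config user.name "Sketch"
git config gc.auto 0                  # keep automatic gc out of the measurement

# Number of loose (unpacked) objects.
loose_count() {
  git count-objects -v | awk '$1 == "count:" { print $2 }'
}

# Number of objects stored in packfiles.
packed_count() {
  git count-objects -v | awk '$1 == "in-pack:" { print $2 }'
}

echo "0.0.1" > deployment_identifier
git add deployment_identifier
git commit -q -m "Deploy 0.0.1"
git tag 0.0.1

loose_before=$(loose_count)
git gc -q
loose_after=$(loose_count)
packed_after=$(packed_count)

echo "loose before gc: $loose_before"
echo "loose after gc:  $loose_after"
echo "packed after gc: $packed_after"
```

Running the same two helpers inside a real deployment repository before and after git gc would show whether the extra space sits in loose objects, though that has not been verified in this thread.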

@grasmash

Collaborator

grasmash commented Oct 2, 2017

I am closing this issue because:

  1. There appears to be an available workaround.
  2. I do not believe that there is any actionable work. It is unclear whether BLT can do anything about this.

Please file a new issue if the workaround is not sufficient.

@grasmash grasmash closed this Oct 2, 2017

@grahl

Contributor Author

grahl commented Oct 10, 2017

Thanks for your feedback. I'm still seeing problems with this, but those seem to be GitLab-specific issues with garbage collection not working consistently. I will follow the recommendation to file a new issue if/when I track this down.
