Idea: Cleanup robots.txt & htaccess - Your thoughts please #193

Closed
camya opened this issue Aug 25, 2015 · 16 comments

camya commented Aug 25, 2015

Hi

This is only a DRAFT... Any suggestions and helping hands are welcome!

At the moment the robots.txt for search engines contains entries which are now also restricted by .htaccess rules. (See 1: Cleanup robots.txt) In my opinion, we can clean up the robots.txt.

We should also use the .htaccess to restrict access to some files, such as composer files, readmes, ... (See 2: Add FilesMatch to main .htaccess) - I created a first version of a FilesMatch.

Also we can remove some unneeded .htaccess files (See 3: Current Deny/Allow structure in htmly via .htaccess)

Please add your suggestions and thoughts.

#1: Cleanup robots.txt

This is only for search engines. Remove all entries from robots.txt except the User-agent: * line.

User-agent: *

Search engines can then index all files which are not restricted by any .htaccess rule.

#2: Add FilesMatch to main .htaccess (in the main directory only)

This is for security. Deny access to various files on the web server. These FilesMatch rules also apply to all subfolders.

There are more candidates for these matches. Feel free to add your ideas.

# deny file access: generic files (?i: ignores the case)
<FilesMatch (?i:(composer\.(json|lock|phar)|(readme|license|copyright)\.(md|txt)))>
    Deny from all
</FilesMatch>

# deny file access: htmly framework files
<FilesMatch (?i:(humans\.txt|\.updateignore))>
    Deny from all
</FilesMatch>

The above FilesMatch rules will deny access to the following files. (They also ignore the case of the names, e.g. CopyRIGHT.TxT will match too.)

  • humans.txt
  • .updateignore
  • composer.json
  • composer.lock
  • composer.phar
  • readme.txt
  • readme.md
  • license.txt
  • license.md
  • copyright.md
  • copyright.txt
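
A possible variant of the first rule, as a sketch only and not tested against HTMLy: it quotes and anchors the pattern so that only the exact filenames listed match (the unanchored form also matches names that merely contain, say, readme.md), and it uses the Apache 2.4 `Require all denied` directive where mod_authz_core is loaded, falling back to the 2.2-style `Deny from all` otherwise. The IfModule split is an assumption about the servers HTMLy runs on. The .updateignore rule could be written the same way.

```apache
# deny file access: generic files
# (?i) ignores the case; ^...$ anchors the match to the whole filename
<FilesMatch "(?i)^(composer\.(json|lock|phar)|(readme|license|copyright)\.(md|txt))$">
    # Apache 2.4 and later
    <IfModule mod_authz_core.c>
        Require all denied
    </IfModule>
    # Apache 2.2 fallback
    <IfModule !mod_authz_core.c>
        Order allow,deny
        Deny from all
    </IfModule>
</FilesMatch>
```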

Idea: Maybe it's a good idea to remove the humans.txt and add the content to the README file.

#3: Current Deny/Allow structure in htmly via .htaccess

  • /cache/.htaccess [Deny all]
  • /config/.htaccess [Deny all]
  • /content/.htaccess [Deny all]
  • /content/images/.htaccess [Allow only jpg|png|gif]
  • /system/.htaccess [Deny all]
  • /system/admin/editor/.htaccess [Allow all]
  • /system/admin/resources/.htaccess [Allow all]
  • /themes/.htaccess [Deny all]
  • /themes/[THEME]/css/.htaccess [Allow all]
  • /themes/[THEME]/fonts/.htaccess [Allow all]
  • /themes/[THEME]/images/.htaccess [Allow all]
  • /themes/[THEME]/img/.htaccess [Allow all]
  • /themes/[THEME]/js/.htaccess [Allow all]
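
For illustration, an "Allow only jpg|png|gif" file such as /content/images/.htaccess could look roughly like the following. The actual contents of the HTMLy file are not shown in this thread, so this is a hypothetical sketch in the same 2.2-style syntax as the rules above.

```apache
# /content/images/.htaccess (hypothetical contents)
# Block everything in this folder by default ...
Order deny,allow
Deny from all

# ... then re-allow only image files (case-insensitive extension match)
<FilesMatch "(?i)\.(jpg|png|gif)$">
    Order deny,allow
    Allow from all
</FilesMatch>
```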

The following .htaccess files can be removed in my opinion, because /system/ already sets the "Deny all".

  • /system/admin/views/.htaccess
  • /system/admin/includes/.htaccess
  • /system/admin/vendor/.htaccess
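
This relies on .htaccess directives applying to all subdirectories unless a deeper .htaccess overrides them. A rough sketch of the two kinds of files involved follows; the exact contents of HTMLy's files are not shown in this thread, so both are assumptions.

```apache
# /system/.htaccess (assumed contents): applies to /system/ and everything
# below it, including /system/admin/views/, includes/ and vendor/
Order allow,deny
Deny from all

# /system/admin/editor/.htaccess (assumed contents, a separate file):
# overrides the inherited rule and re-opens this one subfolder
Order allow,deny
Allow from all
```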

Please add your suggestions and thoughts.

This is a followup to the pull request: #192

Collaborator

Kanti commented Aug 25, 2015

The humans.txt is meant to be reachable.

Read more here: http://humanstxt.org/

We should not remove humans.txt

Owner

danpros commented Aug 26, 2015

I'm just adding the category feature to the core. Please take a look at it or test it: #194

Author

camya commented Aug 26, 2015

@Kanti - I'll then remove the humans.txt from the FilesMatch.

The FilesMatch now looks like this.

# deny file access: htmly framework files
<FilesMatch (?i:(\.updateignore))>
    Deny from all
</FilesMatch>

Author

camya commented Aug 26, 2015

I found 2 candidates for the robots.txt. Login and API are publicly accessible, but shouldn't be indexed by any search engine.

User-agent: *

Disallow: /login
Disallow: /api

Owner

danpros commented Aug 26, 2015

Using .htaccess to deny access to almost all of the folders will lead to trouble. I just upgraded an old installation and the resources were blocked because of the .htaccess placed inside the themes folder.

Owner

danpros commented Aug 26, 2015

It seems we must revert it first, until we have tested it properly in different server environments.

Author

camya commented Aug 26, 2015

Hi Dan.

I just took a look at the content of the theme folders shipped with htmly-2.6.1 by default.

To me it looks like the theme folders only contain PHP files that are not meant to be accessed directly but are used by the framework (post.html.php or main.html.php, for example). The web server should not serve these files directly.

The only files the web server needs to access directly are located inside the theme's css, fonts, images (img) and js folders. These are allowed by the .htaccess files.

I guess the problem is that users added additional scripts to the theme folder. Then the .htaccess indeed blocks access to them too.

It's indeed better to remove the .htaccess files from the themes folder to avoid problems.

Instead, we then need to add some kind of if ("Framework loaded") condition to all framework files within the themes folder. (WordPress does it the same way.)

Each (framework) php file within the theme folder will start with this condition:

<?php if (!defined('HTMLY')) exit(); ?>

In the index.php in our doc root we add the line:

<?php define('HTMLY', true); ?> 

What do you think?

Author

camya commented Aug 26, 2015

@danpros Did your old installation use the default theme? It looks like I forgot to add an .htaccess file inside the css folder of the /theme/default/. Maybe this was the problem?

Owner

danpros commented Aug 27, 2015

@greenphp many users create their own themes, so we should not put .htaccess files inside the theme folder.

Sorry, I am currently working on the new release and will publish it very soon, perhaps in a few minutes, so my current goal is to make sure the content migration works as expected. After this release we can improve htmly again without any limits (in terms of content creation).

Author

camya commented Aug 27, 2015

Fine. I'll wait for the new release then.

Owner

danpros commented Aug 27, 2015

@greenphp released! 😄

Please make sure to use PHP 5.3-compatible syntax (the short array syntax requires PHP 5.4), e.g.:

$posts = array();

Instead of

$posts = [];

Author

camya commented Aug 27, 2015

Great, I've already updated to the new version. Congratulations.

I will avoid the [] syntax in my future commits.

Collaborator

Kanti commented Aug 27, 2015

@danpros Should we really keep PHP 5.3 support?
It no longer gets security patches. The last security patch was a year ago.

Owner

danpros commented Aug 27, 2015

@greenphp thanks, the only change for old themes is the call to the related posts.

@Kanti we should not drop PHP 5.3 yet, since many popular OS versions still ship it in their default repos, e.g. CentOS 6 as far as I know.

Author

camya commented Aug 27, 2015

I've added a pull request #196 for the defined('HTMLY') conditions within the theme template files.
Any feedback is welcome.

About the htaccess structure:

We should also rethink the .htaccess structure and test it.

The folders listed below are the candidates. All of them are "framework" folders. Users normally shouldn't put scripts or assets inside these framework folders. Am I right? Also, nobody should directly access the files inside these folders using the web browser. (Except /system/admin/editor/ and /system/admin/resources/)

  • /cache/.htaccess [Deny all]
  • /config/.htaccess [Deny all]
  • /content/.htaccess [Deny all]
  • /content/images/.htaccess [Allow only jpg|jpeg|png|gif]
  • /system/.htaccess [Deny all]
  • /system/admin/editor/.htaccess [Allow all]
  • /system/admin/resources/.htaccess [Allow all]

The theme folder won't contain an .htaccess, to avoid problems with scripts added by users. #196 adds a condition to the framework PHP files within the theme folders to prevent direct access to them.

The only problem could be the "/system/plugin" folder. Are there already plugins for HTMLY?

Should I commit the .htaccess files listed above again? Then we can test them.

Owner

danpros commented Jan 7, 2024

This issue is too old, so I will close it. Please create a new issue for possible improvements. Thanks

danpros closed this as completed Jan 7, 2024