Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Please make possible to disable the default Joomla behavior of creating sessions for 'guest users' in the db #8772

Closed
creativeprogramming opened this issue Dec 23, 2015 · 15 comments

Comments

@creativeprogramming
Copy link

commented Dec 23, 2015

Joomla creates a JSession - so an entry in #_session table - not just for logged in users, but also for guests (site anonymous visitors, bots included). For a site with session lifetime set to 30 minutes or more and a bit of traffic is easy to have #_session table populated with thousands of entries and get performance issues due to the high number of concurrent database writes per second.

I think this is a point to fix in 2015/2016, please make possible for site admins to prevent Joomla from creating a session for guest users.

I think I know WHY joomla is creating that sessions for anonymous visitors:

  • some unuseful reasons:
    • count online visitors in modules like 'we have xxx guests online' (unuseful '90s feature, get rid of it please)
    • et similia
  • some serious reasons:
    • make the JFactory->enqueueMessage() persistent across multiple redirects during form submit errors et similia
    • to save cart, order data in e-commerce systems and similar, also for guests users before they register/login (but I think this behavior should be invoked by the software that needs it not the core joomla)
    • to store tokens for some anti-spam captchas (I'm not sure), anyway I think also this should be invoked only when needed

What i propose as draft/idea for a solution is the following points:

  1. In global joomla configuration.php split the $lifetime (session duration) configuration in two different settings: one for guests and one for logged in users (so you can set shorter session duration for guests)
  2. In global joomla configuration.php split the $session_handler (session storage) configuration in two different settings: one for guests and one for logged in users (so you can guest session storage to use in-memory storages memcached/redis instead of database where available, and the core joomla code should be smarter and ensure to not use the db at all in that case, nor for the session ids)
  3. In Global joomla configuration.php add an option to disable guest session creation at all (but make this behavior overridable by an extension that needs it, e.g. an e-commerce order/cart page can invoke something with the API to enable it, e.g. JSession::startGuestSession(); before rendering the form)
  4. Use a different session cookie name / session_id for the guest sessions (this make possible point 1 and 2, and prevents session hash collisions, and some issues i had using nginx cache, varnish or apache traffic server in front of joomla, as I'd guest session set-cookie ids cached by the caching layer, but then the guest logged in and become a logged in user (but the session id didn't changed), e.g. even an user with editor permission in frontend, before sign in can/must visit a public (cacheable) page of the site (the login page too), and after that login his session id don't changes, so I had tons of visitors automatically logged in as editor or other users - I manually fixed this with a plugin that changes session_id right after a successful login, but it would be nice to have in the core joomla something that prevents that)

@creativeprogramming creativeprogramming changed the title Please make possible to disable the Joomla behaivor of creating sessions for 'guest' in the db Please make possible to disable the Joomla behaivor of creating sessions for 'guest users' in the db Dec 23, 2015

@creativeprogramming creativeprogramming changed the title Please make possible to disable the Joomla behaivor of creating sessions for 'guest users' in the db Please make possible to disable the default Joomla behaivor of creating sessions for 'guest users' in the db Dec 23, 2015

@creativeprogramming creativeprogramming changed the title Please make possible to disable the default Joomla behaivor of creating sessions for 'guest users' in the db Please make possible to disable the default Joomla behavior of creating sessions for 'guest users' in the db Dec 23, 2015

@brianteeman

This comment has been minimized.

Copy link
Contributor

commented Dec 23, 2015

You dont have to use the database for sessions. The option exists to use native php as well


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/8772.

@brianteeman

This comment has been minimized.

Copy link
Contributor

commented Dec 23, 2015

Scrub that - i forgot that there is still a database entry even then - just storing less data


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/8772.

@mbabker

This comment has been minimized.

Copy link
Member

commented Dec 23, 2015

The #__session table is a multi-purpose database table (sadly) in that:

  1. It is the data store for the database session handler (session_id, time, data columns), this type of data is logged in some form by all session handlers (generally the session ID is used as some kind of identifier, either as the primary key in the database table or the filename for PHP's default filesystem sessions and the data is self explanatory; the time value is handled differently by the handlers based on their requirements)
  2. It is a data store for application metadata (client_id, guest, userid, username columns); because of the hardcoded instructions in the application classes, every row but the data column gets written to the session table to support the use of this metadata

For what it is worth, it is only through the storage of this additional metadata that Joomla users are able to differentiate between sessions for guests and authenticated users and which application those sessions are "assigned" to. Sessions in PHP do not differentiate between guests and authenticated users, they are treated as one in the same and since PHP itself is a stateless code base (the only data that persists between requests is basically assigned as a cookie to the user's browser or recorded in the session data store, or a developer is using something of their own accord like writing data to the filesystem), it boils down to the integrity of the session data store and the application's security measures to keep a user logged in.

Even with changes I've experimented with that defers explicitly starting the session as part of the application's constructor, Joomla's internal dependencies will cause a session to start near the application's doExecute methods. Specifically, the linked lines in JApplicationSite::initialiseApp(), JApplicationAdministrator::doExecute(), and InstallationApplicationWeb::initialiseApp() are where the session is being started by Joomla with my deferred session startup experiment.

A handful of places in the code where end user features are made available through this additional metadata or where this metadata is manipulated outside the handling of a "normal" session:

  • Admin mod_logged module uses the session metadata to show the logged in users, which application they are logged into, and to log the user out (this last one only works if you're using the database session store)
  • Admin mod_status module uses the session metadata to count the number of logged in users per application
  • Site mod_whosonline module uses the session metadata to count the number of frontend users
  • JTable::isCheckedOut() uses the session metadata to determine if a record that has a value for its checked_out column is checked out to a user with an active session
  • The Joomla user plugin has three event listeners (onUserAfterDelete, onUserLogin, onUserLogout) that manipulate the session table's records

Long and short, Joomla should not know nor care if the active session is assigned to a guest or an authenticated user; that distinction is made in the application in other ways surrounding how the user object is persisted in the session (this is why if you're checking if a user is authenticated or a guest you're checking the JUser object, not an API endpoint in JSession). So any API or configuration changes to make distinctions of the two types of sessions is needless complexity IMO and truthfully something I'm not sure could be handled very effectively given PHP's stateless nature and how its internals handle sessions.

@creativeprogramming

This comment has been minimized.

Copy link
Author

commented Dec 24, 2015

What I wanted to say is simply: are you sure that we really need to start sessions for any guest user? In my classic PHP apps, a 'guest user' is a user with 'no session', yes, this is even more 'stateless'.

The splitting I proposed is just an example of a refactoring you can surely do, to permit some site admins to disable the 'guest user sessions' if they don't need them (e.g. I don't need them at all, and I disabled them with the 'hack' of caching, I've no drawbacks so i think them are not so useful, not for sites that just show news to the public, for instance)

Anyway I recognize, it's not an urgent issue, but please don't underestimate the cost of a db write done for any new visitor, it can be a bottleneck when you have lots of new visitors at once (e.g. an article going viral). Is wordpress doing something similar with sessions? I think not.

@mbabker

This comment has been minimized.

Copy link
Member

commented Dec 24, 2015

The way Joomla's coded, yes, a session will be started for every visitor to the site and I pointed where in each application's sequencing where the first call to something that would trigger a lazy loaded session happens. It's long before Joomla's first plugin event happens, so even with Joomla's caching layer enabled, the session API would get booted up. From a technical perspective, you couldn't distinguish between an authenticated user with a pre-existing session and a guest user first visiting the site until after the session is started in the app and the serialized contents extracted. At best, you might be able to add a configuration switch that asks to destroy sessions for unauthenticated users at the end of the application cycle so that sessions aren't persisted, but even with that toggle as long as the non-required metadata is logged in the #__session database table it would truthfully only make matters worse (yet another database operation for data that won't even persist).

It is entirely possible to decouple session handling from the database. Simply get rid of all those features I mentioned previously which log extra metadata in the database. That is what truly forces the DB interaction on every request, even moreso than the default use of the database session handler.

@creativeprogramming

This comment has been minimized.

Copy link
Author

commented Dec 24, 2015

@brianteeman yes, the #_session table is INSERTED with new rows even when 'files' is used as session handler, i was talking also about this,i don't like this behavior, cause it's always a write to disk/mysql connection, even if with less data, it has a cost (consider also the RDBMS response time, that is blocking).

So I ask: are you sure Joomla needs this?

why?

I have the suspect that the origin of this choice are of the times of 'Mambo' and the "who's online module".

Are you sure that this is really needed by Joomla 3? Are you sure that JSession doesn't need a strong review nowadays?

There's a lot of dead code in it, for instance check the 'fix_browser' and 'fix_adres' code, it's not nice to see that commented code since ages ... (anyway i'm not telling to remove it: I use it, as i use it as security check for the session ip to prevent hijacking, you can make a configuration switch also to enable this)

You're doing a great job developing the new features of the CMS and refactoring a lot of classes, but I can't explain myself why to leave this foundament 'untouchable' , this basic and simple part, so important for security and performance: the session.

I mean, please don't say: "It was developed this way, it works, and we don't want to touch it anymore" because I'm not sure it was developed so fine as i suspect is pretty old.

Maybe it's ok, but please be a bit 'paranoid' on this and start some deep study on it... can you optimize it? Is it perfect yet or we can do a bit better?

@creativeprogramming

This comment has been minimized.

Copy link
Author

commented Dec 24, 2015

It is entirely possible to decouple session handling from the database. Simply get rid of all those features I mentioned previously which log extra metadata in the database. That is what truly forces the DB interaction on every request

Yes I'm asking also for this: decoupling the session handling from the database

even moreso than the default use of the database session handler.

this not necesarly if you limit sessions to 'real' logged users

At best, you might be able to add a configuration switch that asks to destroy sessions for unauthenticated users at the end of the application cycle so that sessions aren't persisted, but even with that toggle as long as the non-required metadata is logged in the #__session database table it would truthfully only make matters worse

yes this is worse

from a technical perspective, you couldn't distinguish between an authenticated user with a pre-existing session and a guest user first visiting the site until after the session is started in the app and the serialized contents extracted

true (but I've to test if really a disk read is really needed at session_status()/start(), PHP should cache active session ids in a shared memory object if it was developed fine), but anyway, one thing is the 'cost' of a read, and another is the cost of a read+write, that is what joomla is doing now for any visitor, it's like this (i'm pseudocoding it in old school php):

if (session_status() !== PHP_SESSION_ACTIVE) { //  read (from disk?)
session_start(); //generate session_id and send back to client cookies
$_SESSION["guest"]=1; //disk write (and mysql connection used!)
//...other metadata writes...
}
//i know that the real write will be only at session_write_close(), in the JSession->close() method now, but anyway it will be a write for any incoming request 

while I would do this way

if (session_status() !== PHP_SESSION_ACTIVE) { // read (from disk?)
 //don't generate any session_id 
GlobalAccessibleClass::setIsGuestUser(true); //in memory write
}

a session will be started for every visitor to the site and I pointed where in each application's sequencing where the first call to something that would trigger a lazy loaded session happens. It's long before Joomla's first plugin event happens, so even with Joomla's caching layer enabled,

yes, in fact i dropped the use of the standard Joomla 'page-caching' layer (the system>cache plugin) because it was very low performant due to those writes always invoked anyway: always a 300ms for the first-byte (this why I hate this behavior guys, don't blame me, i spent hours to debug it) for this reason i overcomed the issue using varnish, apache traffic server, or nginx_fastcgi_cache in front of joomla, this means that in my configuration, for a guest user, when chache is hot, not a single line of PHP joomla code is invoked and all works fine (I added an 'x-logged-in' cookie set in an onAfterLogin for logged in users so the reverse proxy can distinguish logged in users and forward the request to Joomla if the cookie is set) but i will not need this complex configuration if Joomla wasn't setting the session 'before' the app initialization for any user

Yes, you can use the same technique too: check for the session cookie presence to determine if the user is not a guest, also in php, in this case surely you'll not have to do any disk read to check if the user is a guest:

if (!hasCookie(session_name())) { //in memory read (a read from request headers that server got in memory) 
GlobalAccessibleClass::setIsGuestUser(true); //in memory write
}
@mbabker

This comment has been minimized.

Copy link
Member

commented Dec 24, 2015

this not necesarly if you limit sessions to 'real' logged users

You can't though without opening up security holes. session_status() is only PHP 5.4+ (so unusable while Joomla supports PHP 5.3) and from what I've gathered from its documentation, it will only tell you if the session has been started, not whether one was persisted from a previous request. So even with that check the session would have to be fully started still to determine if the user is authenticated. Personally, I would not depend on request headers or cookies as a means to determine a user's status, unless those values can only be generated by a system the end user cannot manipulate (so if you're running a proxy server between the end user and your Joomla install).

There is no guest status persisted to $_SESSION in Joomla. JUser::$guest is where the guest property is tracked as the JUser object is persisted as part of the session. All of the metadata I pointed out previously is always stored in the database and aside from the session ID itself, none of it is part of the true session handling for the application.

All this metadata is frankly optional and somewhat annoying add on data for a majority of users. I have found only one condition in Joomla that gets even remotely broken by just dumping it all completely. Of the list I posted earlier, JTable::isCheckedOut() would need updates to deal with its condition to query the session table to determine if a record with a value for its checked_out column matches a user with an active session.

I don't disagree there needs to be work on the infrastructure. But frankly, short term I don't see a way to refactor Joomla to a point where it would be possible to get even to the first plugin event trigger (without adding a new one in the application constructors before JSession is instantiated) without the session being booted.

@creativeprogramming

This comment has been minimized.

Copy link
Author

commented Dec 24, 2015

You can't though without opening up security holes. session_status() is only PHP 5.4+ (so unusable while Joomla supports PHP 5.3) and from what I've gathered from its documentation

Ok, but maybe for Joomla 4?

it will only tell you if the session has been started, not whether one was persisted from a previous request.

and what about this:

if (session_id()=="")

So even with that check the session would have to be fully started still to determine if the user is authenticated.

Yes you will still have to start a session for non-guest users, but you can do it in onAfterLogin event, right before where you currently set the JUser data and metadata.

Personally, I would not depend on request headers or cookies as a means to determine a user's status, unless those values can only be generated by a system the end user cannot manipulate

But this is what php does internally, yes I was shocked when i realized it some years ago, but client cookies are the only 'keys' for the server side sessions (for this reason i use HTTPS on all the sites, to avoid session cookie sniffing).

Anyway I mean: if there's no session cookie, you can trust that the user is surely a guest (also in the current implementation, if you delete the session cookie you become a guest), but of course not that if there's a cookie we can just trust it and consider it as a logged in user, in that case the system should check for the JUser object presence and its values in THAT session, all like is doing now , and this will be always a thing that end user cannot manipulate (except if they guess or obtain the session id, for this reason another improvement i suggest is to use multiple cookies as session ids, not only one, like bigs like Facebook and Google do, just in order to increase the session id 'key length' to make harder the brute forcing)

There is no guest status persisted to $_SESSION in Joomla. JUser::$guest is where the guest property is tracked as the JUser object is persisted as part of the session.

Yes, my pseudocode was an optimistic example, there's more data written i know

All of the metadata I pointed out previously is always stored in the database and aside from the session ID itself, none of it is part of the true session handling for the application.

Yes, and this is not fine don't you agree? It's redundant!

All this metadata is frankly optional and somewhat annoying add on data for a majority of users.

Yes, you agree :)

have found only one condition in Joomla that gets even remotely broken by just dumping it all completely. Of the list I posted earlier, JTable::isCheckedOut() would need updates to deal with its condition to query the session table to determine if a record with a value for its checked_out column matches a user with an active session.

Yes this is useful, for reasons like this I was asking keep this behaivor on, but just for 'real' users, not guests (You'll never have a guest locking an item, so you can simply check if the row is expired)

Anyway you can also get rid of this, and implement an 'expire' for those locks, and an ajax poller that if the user is active, and editing an item refreshes the checked out status extending his lifetime

I don't disagree there needs to be work on the infrastructure. But frankly, short term I don't see a way to refactor Joomla to a point where it would be possible to get even to the first plugin event trigger (without adding a new one in the application constructors before JSession is instantiated) without the session being booted.

Ok I understand that is not a priority and I agree, sorry, i don't want to criticize anything, you're doing a great job: thank you all for all! Joomla is always the best CMS around IMHO, I just wanted to point out this to ensure that you'll not completely forget of this part of its codebase, and now I know you will not.

So, nevermind, maybe I'm just a bit obsessive on this, please forgive me.

Thank you for your time and all the detailed replies
Best Regards

@mbabker

This comment has been minimized.

Copy link
Member

commented Dec 24, 2015

staging...mbabker:refactor-session-metadata-coupling is basically everything needed to split the logic for the extra metadata from the "real" session requirements without dealing with every feature using that metadata.

session_id() and other session functions only return data about the current state of the session, not whether there is a persisted session. Even the $_SESSION superglobal is only available after the session is started.

Yes you have to start session just for non-guest users, but you can do it in onAfterLogin event, right before where you currently set the JUser data and metadata.

That's far too late, and that event only runs when a user actually tries to login. The application itself checks for data from the session literally two or three executable lines of code into when $app->execute() is called. Without refactoring a lot of the application startup sequence there is nothing that can be done to keep JSession::start() from being called before the onAfterInitialise plugin event. The very first line in the site and admin apps initialiseApp methods is a call to JFactory::getUser(), which attempts to load a user object from the session. A mix of user params, browser data, and site configuration is used to boot up the global JLanguage object.

Anyway i mean: if there's no session cookie user you can trust that is surely a guest (also in the current implementation, if you delete the session cookie you become a guest), but of course not that if there's a cookie we can just trust it and consider it as a logged in user, in that case the system should check for the JUser object presence and its values in THAT session, all like is doing now , and this will always be a thing that end user cannot manipulate

While this check may be feasible at the beginning of the application's cycle, you still have to have a way to instruct the API "I want JSession::start() and its internals to run in full even though the config says don't start sessions for guests because my extension's features require session persistence". Considering the number of calls to get data out of the session, this might end up being more taxing than it's worth IMO. It's something that might be worth playing with long term, but on top of adding new config values and checks one would need to make sure enforcing them doesn't hurt the app more than help.

@ggppdk

This comment has been minimized.

Copy link
Contributor

commented Jan 4, 2016

Since not creating a JSession will be very difficult, i believe doing 1 and 2 suggestions should be examined

  • only extensions that directly query the session table would have to be updated

Some sites (that have enough RAM) with many active visitors (guests, even bots) have converted 'data' column of session table, to varchar(10000) or varchar(20000) and then converted session table to MEMORY storage (and configured a small session time)

  • so besides examining a memcached/redis solution for guests, we could examine using MEMORY storage engine for a new session table 'session_guests' with a varchar of 20000 for guests (fallback to INNODB is MEMORY is not allowed or if server is not MySQL) . Just to avoid memory problems with SQL server and big session times (30+ or 60+ minutes), a different session time for guests will be needed too
  • when user logins the session data will be stored to the current 'session' table instead of table 'session_guests'
@mbabker

This comment has been minimized.

Copy link
Member

commented Jan 4, 2016

*sigh*

THERE IS NO DISTINCTION AT THE PHP LEVEL BETWEEN A "GUEST" SESSION AND AN AUTHENTICATED SESSION. ADDING ANY CODE TO JOOMLA TO TRY AND MAKE THAT DISTINCTION ONLY FURTHER COUPLES THE APPLICATION TO NEEDING THAT EXTRA METADATA STORED IN THE SESSION TABLE.

@ggppdk

This comment has been minimized.

Copy link
Contributor

commented Jan 4, 2016

my comment was just a thought for discussion ))

about php not having a distintion for guest / authenticated,
yes, i forgot about it in my above comments

(maybe) we can solve it

  • by always querying the 'session_guests' table first and then, (if row is missing) query the 'session' table. The extra read query will cost very little since table will be using MEMORY storage
  • or by always having an entry in 'session_guests' to indicate a logged user aka query 'session' table

Also if memory storage is not available the 'session_guests' maybe not create the table at all, joomla configuration should indicate if table exists or not

  • the logic regarding usage of 'session_guests' will only be inside JSession (right?) so no extensions will break if they do not query the 'session' table directly

e.g. J3.4.7+ is using base64 / serializing on session data, and nothing broke because of it,

  • if 3rd party extensions are not accessing session table directly nothing should break (right ?), if something breaks 3rd party extension developers will fix it, particularly if this is done by J4, it will be less problematic

the above are just thoughts, i do not fully know the implications of the changes

e.g. extensions that try to store "large" data for *guests users into session ... will break because of the reduced available space of varchar(20000)

@mbabker

This comment has been minimized.

Copy link
Member

commented Jan 4, 2016

What do we want to do? Keep Joomla dependent on the need to write extraneous and borderline useless metadata somewhere (and have the ability to read it) to give admins some false sense of knowing which users are logged into their site and where and potentially having the ability to forcefully log them out (feature only works if using database session store) or find a way to drop it so the only true dependency on sessions in Joomla is the actual session data? If it's the former, then have fun adding in all these extra options and tables and processes that can't be removed. If it's the latter, I'm interested because frankly I think it's stupid for the application to track this stuff.

@Bakual

This comment has been minimized.

Copy link
Contributor

commented Jan 5, 2016

I'm closing this as I agree with @mbabker and don't see a simple solution to achieve the goal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.