Decide how to count 'time to hire' #9
Comments
There is also df_contact$Confirmed_Hired_Date__c |
Good point! Looks like there is a 'hire date' in hiring table which connects to contact via hire$client_name = contact$id |
Hello, There is a time to hire in df_feedback, although not sure if we have a join to other tables. |
We sort of need to define where that time "starts" though too. Our group needs to work with "time in program" which includes "time to hire" as well as the time from "program start" to the time clients leave without being hired. Unless of course there's a "time in program" variable in the data. I haven't seen one though. |
I guess by "We" I mean "our team." ;) |
I did not find a time-in-program variable, but there is Months_Unemployed__c in the Salesforce hire info data set. It may be useful if you can find when the client joined. |
^^ Months unemployed could be the answer we're looking for. With green being the 'actively seeking employment' label for HH, could we also find the difference between contact$date_turned_green and hire$start_date? This could be one answer to your question of how to define a client's program 'start', although they could be working within the program while labelled red or purple as well. Not sure on the best route here; anyone have a preference? And while your question directly requires this type of answer, @mitchb63, our team could also use it to see if volunteers affect this time period or if demographics have any significant effect on it. |
Maybe we can get an answer to how HH wants us to determine the "start date" during the webinar Tuesday. |
I did some experimenting with this today. "Created_Date" works to a degree but it doesn't seem to be the answer in some cases. Some accounts were apparently created after the fact because I came up with negative "Time in Program" values at times. I also don't think Date turned Green will work for our group because we also need to calculate "Time to date turned Black" which happens before they turn Green. |
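For anyone who wants to reproduce the negative-value check, here is a minimal base R sketch. The toy data frame and its column names (`created_date`, `end_date`) are made up for illustration; they are not the real Salesforce field names, and the real end point would be whichever "color" date or hire date applies.

```r
# Toy data; column names are hypothetical stand-ins, not the real fields.
clients <- data.frame(
  id           = c("a", "b", "c"),
  created_date = as.Date(c("2018-01-10", "2018-06-01", "2019-02-15")),
  end_date     = as.Date(c("2018-03-11", "2018-05-20", "2019-02-28"))
)

# Days between account creation and the end point (hire, exit, etc.).
clients$time_in_program <- as.numeric(clients$end_date - clients$created_date)

# Rows where the account was apparently created after the end point.
negative_rows <- clients[clients$time_in_program < 0, ]
print(negative_rows)
```

Listing the negative rows first, rather than filtering them out silently, makes it easier to see whether they follow a pattern (for example, accounts that were back-filled later).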
I think I may have found something! This is described as "Date of first conversation with Transition Specialist". |
@mitchb63 Great find! I'm not sure how I missed that when I was looking through the data dictionary. Do you think we can use this and date_turned_blue/green/black to help answer those business questions? I'll let you know if I find anything else, but it sounds like it could be useful to make some variables such as time_to_green/black/blue. That would at least be a start and provide a look at how many days (or whatever interval) it takes for each person to move on in the process. Thoughts? |
I used it to create a "Days in Program" variable that includes all of the end point "colors". It also uses a somewhat arbitrary date of 3/1/2019 as the end date for those that are "currently in the program". Basically, I was just trying to get a very rough idea of what the variables involved in our problems look like. I created a Word doc with some overview info to help team members get up to speed. |
Just browsed the Word doc and it looks solid! My eda branch is looking to do the same for our team's business questions, but I'm just fleshing it out now. Could you share the code you used to create the Days in Program variable, please? I'd like to explore it further as well! 😁 |
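A rough sketch of what the time_to_green/black/blue idea could look like in base R. This runs on toy data: `first_conversation_date` is a hypothetical name for the start-date field, and the `date_turned_*` columns follow the naming used in this thread rather than a checked list of the real fields.

```r
# Toy data; `first_conversation_date` is a hypothetical name for the start field.
contact <- data.frame(
  id                      = c("a", "b"),
  first_conversation_date = as.Date(c("2018-01-01", "2018-02-01")),
  date_turned_blue        = as.Date(c("2018-01-15", NA)),
  date_turned_green       = as.Date(c("2018-01-31", "2018-03-03")),
  date_turned_black       = as.Date(c(NA, "2018-02-11"))
)

# One time_to_<color> column per color, in days from the start date.
# Clients who never reached a color keep NA for that column.
for (color in c("blue", "green", "black")) {
  date_col <- paste0("date_turned_", color)
  contact[[paste0("time_to_", color)]] <-
    as.numeric(contact[[date_col]] - contact$first_conversation_date)
}

print(contact[, c("id", "time_to_blue", "time_to_green", "time_to_black")])
```

Keeping NA for colors a client never reached, instead of 0, avoids mixing "never happened" with "happened on day zero".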
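The cutoff idea can be sketched like this in base R. It is only an illustration on toy data, not the code that was actually used: `start_date` and `end_date` are hypothetical column names, and a missing `end_date` is assumed to mean the client is still in the program.

```r
# Somewhat arbitrary end date for clients who are still in the program.
cutoff <- as.Date("2019-03-01")

# Toy data; column names are hypothetical stand-ins.
clients <- data.frame(
  id         = c("a", "b", "c"),
  start_date = as.Date(c("2018-12-01", "2019-01-15", "2019-02-01")),
  end_date   = as.Date(c("2019-01-30", NA, "2019-02-20"))
)

# Clients with no end date are treated as still active and get the cutoff.
still_active <- is.na(clients$end_date)
effective_end <- clients$end_date
effective_end[still_active] <- cutoff

clients$days_in_program <- as.numeric(effective_end - clients$start_date)
print(clients)
```

Because the result for active clients depends entirely on the chosen cutoff, it may be worth keeping `still_active` as its own column so those rows can be separated out later.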
No problem. It’s a bit clumsy but it seems to work. When I get back to my computer I’ll send it. Btw, what does EDA stand for? 🤔
|
Exploratory data analysis. A little vague of a term really. The eda branch is just me exploring the data more through quick visuals in python. Any unexpected or useful findings I’ll be sure to store in a visuals/eda folder within the repo over the next few days.
Thanks!
Andy
|
Here is the code for what I did this morning. It's an add-on to the cleaner.r file. I haven't taken the time to split it out nicely like you have yet. As I mentioned, there are some questionable choices made in here but I was trying to crank something out to look at before the call tonight.

    ######################## TEAM2 TOPIC SPECIFIC ########################
    # Needs dplyr (left_join, filter), lubridate (today, ymd) and ggplot2;
    # presumably these are already loaded earlier in cleaner.r.

    # merge contact and hire
    df_hire_contact_join <- left_join(df_contact, df_hire, by = c("Id" = "Client_Name__c"))

    # Reduce variables
    df_topic_edit1 <- df_hire_contact_join[,c(1,4:6,9:11,23:26,47,49,54,66,79,89,96,99,104,117,157,168,179,183,186,187,190,238:241,257,260,261,272,283,292,302,303,305:310,315,318,336,337,359,361,377,382:389,394,395)]

    # Add Age variable based on today??
    df_topic_edit1["Age"] <- as.numeric(today() - ymd(as.character(df_topic_edit1$Date_Of_Birth__c)))/365

    # Replace NA's with 0's
    for (i in 1:nrow(df_topic_edit1)){
      # (loop body did not survive in this thread)
    }

    # Bin the ages according to some standard I forget the name of!
    # and YES, these should probably be functions with lapply not for loops
    df_topic_edit1["Age_bin"] <- ""
    for (i in 1:nrow(df_topic_edit1)){
      # (loop body did not survive in this thread)
    }
    df_topic_edit1$Age_bin <- as.factor(unlist(df_topic_edit1$Age_bin))

    # Create Days in Program
    df_topic_edit1["Days_in_Program"] <- NA
    for (i in 1:nrow(df_topic_edit1)){
      # (loop body did not survive in this thread)
    }

    # Get rid of rows that are not actual clients
    df_clients <- filter(df_topic_edit1, Client__c == 1)

    # Get rid of a weird outlier row
    df_clients_no <- filter(df_clients, Days_in_Program < 73000 & Days_in_Program > 0)

    # Plot days in program for spouses vs vets
    ggplot(df_clients_no, aes(x = Military_Spouse_Caregiver__c, y = Days_in_Program)) + geom_boxplot()

    # Create a df of spouse data
    df_contact_spouses <- filter(df_topic_edit1, Military_Spouse_Caregiver__c == 1 & Client__c == 1)

    # Create a df of vet data
    df_contact_vets <- filter(df_topic_edit1, Military_Spouse_Caregiver__c == 0 & Client__c == 1)

    # Plot various spouse demographics
    ggplot(df_contact_spouses) + geom_bar(aes(x = MailingState)) + theme(axis.text.x = element_text(angle = 90, hjust = 1))

    df_contact_spouses_gender <- filter(df_contact_spouses, Gender__c != "" & Gender__c != "--None--")
    df_contact_spouses_race <- filter(df_contact_spouses, Race__c != "")
    ggplot(df_contact_spouses) + geom_bar(aes(x = Highest_Level_of_Education_Completed__c)) + theme(axis.text.x = element_text(angle = 90, hjust = 1))
    df_contact_spouses_branch <- filter(df_contact_spouses, Service_Branch__c != "")
    df_contact_spouses_status <- filter(df_contact_spouses, Service_Members_Status__c != "")
    df_contact_spouses_age <- filter(df_contact_spouses, Age_bin != "Underage")
|
This is Awesome! Thanks for posting. Unless you wanted/planned to I'll toss the top few lines of this into the cleaner saved in here and print out the joined dataset which might just give us #13 ! Or it will at least give us a severely reduced size dataset to explore with. Thanks again! |
Just remember that I was selecting variables based on the Serving Spouses
questions.
|
And yeah, that's fine! Some of that does really go in the cleaner.
|
GTG! |
Split out from #3: we need to figure out how to track time to hire in order to answer the business problem about demographic differences and hiring time (#2).
As @mitchb63 mentioned, there are time_to_(color) variables in df_contact we can use for this.
Should time to blue be the dependent variable for this question? Or should we create one based on the time from the date the user account was created to the date turned blue? Or some other method?